PCT WORLD INTELLECTUAL PROPERTY ORGANIZATION

International Bureau

INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT) 




(51) International Patent Classification:

C12N 15/31, C07K 14/35, C12N 15/62,
G01N 33/569, C12Q 1/68



A2 



(11) International Publication Number: WO 97/09429

(43) International Publication Date: 13 March 1997 (13.03.97)

(21) International Application Number: PCT/US96/14675

(22) International Filing Date: 30 August 1996 (30.08.96) 



(30) Priority Data:

    08/523,435    1 September 1995 (01.09.95)     US
    08/532,136    22 September 1995 (22.09.95)    US
    08/620,280    22 March 1996 (22.03.96)        US
    08/658,800    5 June 1996 (05.06.96)          US
    08/680,573    12 July 1996 (12.07.96)         US



(71) Applicant: CORIXA CORPORATION [US/US]; Suite 464,
1124 Columbia Street, Seattle, WA 98104 (US).

(72) Inventors: REED, Steven, G.; 2843 - 122nd Place N.E.,
Bellevue, WA 98005 (US). SKEIKY, Yasir, A., W.; 8327
- 25th Avenue N.W., Seattle, WA 98117 (US). DILLON,
Davin, C.; 21607 N.E. 24th Street, Redmond, WA 98053
(US). CAMPOS-NETO, Antonio; 9308 N.E. Midship
Court, Bainbridge Island, WA 98110 (US). HOUGHTON,
Raymond; 2636 - 242nd Place S.E., Bothell, WA 98021
(US). VEDVICK, Thomas, H.; 1301 Spring Street, Seattle,
WA 98104 (US). TWARDZIK, Daniel, R.; 10195 South
Beach Drive, Bainbridge Island, WA 98110 (US).



(74) Agents: MAKI, David, J. et al.; Seed and Berry L.L.P., 6300
Columbia Center, 701 Fifth Avenue, Seattle, WA 98104-7092 (US).



(81) Designated States: AL, AM, AT, AU, BB, BG, BR, BY, CA,
CH, CN, CZ, DE, DK, EE, ES, GB, GE, HU, IL, IS, JP,
KE, KG, KP, KR, KZ, LK, LR, LS, LT, LU, LV, MD, MG,
MK, MN, MW, MX, NO, NZ, PL, PT, RO, RU, SD, SE,
SG, SI, SK, TJ, TM, TR, TT, UA, UG, UZ, VN, ARIPO
patent (KE, LS, MW, SD, SZ, UG), Eurasian patent (AM,
AZ, BY, KG, KZ, MD, RU, TJ, TM), European patent (AT,
BE, CH, DE, DK, ES, FI, FR, GB, GR, IE, IT, LU, MC,
NL, PT, SE), OAPI patent (BF, BJ, CF, CG, CI, CM, GA,
GN, ML, MR, NE, SN, TD, TG).



Published 

Without international search report and to be republished 
upon receipt of that report. 



(54) Title: COMPOUNDS AND METHODS FOR DIAGNOSIS OF TUBERCULOSIS

(57) Abstract

[Figure: plot not reproduced; axis label "Recombinant (ng/well)"]

Compounds and methods for diagnosing tuberculosis are disclosed. The compounds provided include polypeptides that contain at least

COMPOUNDS AND METHODS FOR DIAGNOSIS OF TUBERCULOSIS

Technical Field

The present invention relates generally to the detection of
Mycobacterium tuberculosis infection. The invention is more particularly related to
polypeptides comprising a Mycobacterium tuberculosis antigen, or a portion or other
variant thereof, and the use of such polypeptides for the serodiagnosis of
Mycobacterium tuberculosis infection.

Background of the Invention 
Tuberculosis is a chronic, infectious disease that is generally caused by
infection with Mycobacterium tuberculosis. It is a major disease in developing
countries, as well as an increasing problem in developed areas of the world, with about
8 million new cases and 3 million deaths each year. Although the infection may be
asymptomatic for a considerable period of time, the disease is most commonly
manifested as an acute inflammation of the lungs, resulting in fever and a nonproductive
cough. If left untreated, serious complications and death typically result.

Although tuberculosis can generally be controlled using extended
antibiotic therapy, such treatment is not sufficient to prevent the spread of the disease.
Infected individuals may be asymptomatic, but contagious, for some time. In addition,
although compliance with the treatment regimen is critical, patient behavior is difficult
to monitor. Some patients do not complete the course of treatment, which can lead to
ineffective treatment and the development of drug resistance.

Inhibiting the spread of tuberculosis will require effective vaccination
and accurate, early diagnosis of the disease. Currently, vaccination with live bacteria is
the most efficient method for inducing protective immunity. The most common
Mycobacterium for this purpose is Bacillus Calmette-Guerin (BCG), an avirulent strain
of Mycobacterium bovis. However, the safety and efficacy of BCG is a source of
controversy and some countries, such as the United States, do not vaccinate the general
public. Diagnosis is commonly achieved using a skin test, which involves intradermal
exposure to tuberculin PPD (purified protein derivative). Antigen-specific T cell
responses result in measurable induration at the injection site by 48-72 hours after
injection, which indicates exposure to Mycobacterial antigens. Sensitivity and
specificity have, however, been a problem with this test, and individuals vaccinated
with BCG cannot be distinguished from infected individuals.

While macrophages have been shown to act as the principal effectors of
M. tuberculosis immunity, T cells are the predominant inducers of such immunity. The
essential role of T cells in protection against M. tuberculosis infection is illustrated by
the frequent occurrence of M. tuberculosis in AIDS patients, due to the depletion of
CD4 T cells associated with human immunodeficiency virus (HIV) infection.
Mycobacterium-reactive CD4 T cells have been shown to be potent producers of
gamma-interferon (IFN-γ), which, in turn, has been shown to trigger the
anti-mycobacterial effects of macrophages in mice. While the role of IFN-γ in humans is
less clear, studies have shown that 1,25-dihydroxy-vitamin D3, either alone or in
combination with IFN-γ or tumor necrosis factor-alpha, activates human macrophages
to inhibit M. tuberculosis infection. Furthermore, it is known that IFN-γ stimulates
human macrophages to make 1,25-dihydroxy-vitamin D3. Similarly, IL-12 has been
shown to play a role in stimulating resistance to M. tuberculosis infection. For a review
of the immunology of M. tuberculosis infection see Chan and Kaufmann, in
Tuberculosis: Pathogenesis, Protection and Control, Bloom (ed.), ASM Press,
Washington, DC, 1994.

Accordingly, there is a need in the art for improved diagnostic methods
for detecting tuberculosis. The present invention fulfills this need and further provides
other related advantages.

Summary of the Invention 

Briefly stated, the present invention provides compositions and methods
for diagnosing tuberculosis. In one aspect, polypeptides are provided comprising an
antigenic portion of a soluble M. tuberculosis antigen, or a variant of such an antigen
that differs only in conservative substitutions and/or modifications. In one embodiment
of this aspect, the soluble antigen has one of the following N-terminal sequences:

(a) Asp-Pro-Val-Asp-Ala-Val-Ile-Asn-Thr-Thr-Cys-Asn-Tyr-Gly-
Gln-Val-Val-Ala-Ala-Leu (SEQ ID No. 115);

(b) Ala-Val-Glu-Ser-Gly-Met-Leu-Ala-Leu-Gly-Thr-Pro-Ala-Pro-
Ser (SEQ ID No. 116);

(c) Ala-Ala-Met-Lys-Pro-Arg-Thr-Gly-Asp-Gly-Pro-Leu-Glu-Ala-
Ala-Lys-Glu-Gly-Arg (SEQ ID No. 117);

(d) Tyr-Tyr-Trp-Cys-Pro-Gly-Gln-Pro-Phe-Asp-Pro-Ala-Trp-Gly-
Pro (SEQ ID No. 118);

(e) Asp-Ile-Gly-Ser-Glu-Ser-Thr-Glu-Asp-Gln-Gln-Xaa-Ala-Val
(SEQ ID No. 119);

(f) Ala-Glu-Glu-Ser-Ile-Ser-Thr-Xaa-Glu-Xaa-Ile-Val-Pro (SEQ ID
No. 120);

(g) Asp-Pro-Glu-Pro-Ala-Pro-Pro-Val-Pro-Thr-Thr-Ala-Ala-Ser-
Pro-Pro-Ser (SEQ ID No. 121);

(h) Ala-Pro-Lys-Thr-Tyr-Xaa-Glu-Glu-Leu-Lys-Gly-Thr-Asp-Thr-
Gly (SEQ ID No. 122);

(i) Asp-Pro-Ala-Ser-Ala-Pro-Asp-Val-Pro-Thr-Ala-Ala-Gln-Leu-
Thr-Ser-Leu-Leu-Asn-Ser-Leu-Ala-Asp-Pro-Asn-Val-Ser-Phe-
Ala-Asn (SEQ ID No. 123);

(j) Xaa-Asp-Ser-Glu-Lys-Ser-Ala-Thr-Ile-Lys-Val-Thr-Asp-Ala-
Ser (SEQ ID No. 129);

(k) Ala-Gly-Asp-Thr-Xaa-Ile-Tyr-Ile-Val-Gly-Asn-Leu-Thr-Ala-
Asp (SEQ ID No. 130); or

(l) Ala-Pro-Glu-Ser-Gly-Ala-Gly-Leu-Gly-Gly-Thr-Val-Gln-Ala-
Gly (SEQ ID No. 131);

wherein Xaa may be any amino acid.

In a related aspect, polypeptides are provided comprising an
immunogenic portion of an M. tuberculosis antigen, or a variant of such an antigen that
differs only in conservative substitutions and/or modifications, the antigen having one
of the following N-terminal sequences:

(m) Xaa-Tyr-Ile-Ala-Tyr-Xaa-Thr-Thr-Ala-Gly-Ile-Val-Pro-Gly-Lys-
Ile-Asn-Val-His-Leu-Val (SEQ ID No. 132); or

(n) Asp-Pro-Pro-Asp-Pro-His-Gln-Xaa-Asp-Met-Thr-Lys-Gly-Tyr-
Tyr-Pro-Gly-Gly-Arg-Arg-Xaa-Phe (SEQ ID No. 124);

wherein Xaa may be any amino acid.
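
The N-terminal sequences above are written in three-letter notation, with Xaa standing for any amino acid. As an illustration only, the following minimal Python sketch converts such a sequence to one-letter codes and checks whether a candidate protein begins with it, treating Xaa as a wildcard. The function names and the example protein string are hypothetical and are not part of the disclosure.

```python
import re

# Standard three-letter to one-letter amino acid codes; Xaa maps to X.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Xaa": "X",
}


def to_one_letter(three_letter_seq):
    """Convert 'Ala-Glu-Glu' style notation to one-letter codes ('AEE')."""
    return "".join(THREE_TO_ONE[res] for res in three_letter_seq.split("-"))


def has_n_terminus(protein, three_letter_seq):
    """Return True if `protein` (one-letter codes) starts with the sequence.

    Each Xaa position is allowed to match any single uppercase letter.
    """
    pattern = "^" + to_one_letter(three_letter_seq).replace("X", "[A-Z]")
    return re.match(pattern, protein) is not None


# Sequence (f), SEQ ID No. 120, as recited above.
SEQ_120 = "Ala-Glu-Glu-Ser-Ile-Ser-Thr-Xaa-Glu-Xaa-Ile-Val-Pro"

# Hypothetical candidate protein, used only to exercise the function.
candidate = "AEESISTCEGIVPKL"
print(to_one_letter(SEQ_120))
print(has_n_terminus(candidate, SEQ_120))
```

The match is anchored at the first residue, so a protein that carries the same stretch internally would not be reported as having this N-terminus.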

In another embodiment, the antigen comprises an amino acid sequence
encoded by a DNA sequence selected from the group consisting of the sequences
recited in SEQ ID Nos. 1, 2, 4-10, 13-25, 52, 94 and 96, the complements of said
sequences, and DNA sequences that hybridize to a sequence recited in SEQ ID Nos. 1,
2, 4-10, 13-25, 52, 94 and 96 or a complement thereof under moderately stringent
conditions.

In a related aspect, the polypeptides comprise an antigenic portion of an
M. tuberculosis antigen, or a variant of such an antigen that differs only in conservative
substitutions and/or modifications, wherein the antigen comprises an amino acid
sequence encoded by a DNA sequence selected from the group consisting of the
sequences recited in SEQ ID Nos. 26-51, the complements of said sequences, and DNA
sequences that hybridize to a sequence recited in SEQ ID Nos. 26-51 or a complement
thereof under moderately stringent conditions.

In related aspects, DNA sequences encoding the above polypeptides,
recombinant expression vectors comprising these DNA sequences and host cells
transformed or transfected with such expression vectors are also provided.

In another aspect, the present invention provides fusion proteins
comprising a first and a second inventive polypeptide or, alternatively, an inventive
polypeptide and a known M. tuberculosis antigen.

In further aspects of the subject invention, methods and diagnostic kits
are provided for detecting tuberculosis in a patient. The methods comprise:
(a) contacting a biological sample with at least one of the above polypeptides; and
(b) detecting in the sample the presence of antibodies that bind to the polypeptide or
polypeptides, thereby detecting M. tuberculosis infection in the biological sample.
Suitable biological samples include whole blood, sputum, serum, plasma, saliva,
cerebrospinal fluid and urine. The diagnostic kits comprise one or more of the above
polypeptides in combination with a detection reagent.

The present invention also provides methods for detecting
M. tuberculosis infection comprising: (a) obtaining a biological sample from a patient;
(b) contacting the sample with a first and a second oligonucleotide primer in a
polymerase chain reaction, the first and the second oligonucleotide primers comprising
at least about 10 contiguous nucleotides of a DNA sequence encoding the above
polypeptides; and (c) detecting in the sample a DNA sequence that amplifies in the
presence of the first and second oligonucleotide primers.

In a further aspect, the present invention provides a method for detecting
M. tuberculosis infection in a patient comprising: (a) obtaining a biological sample
from the patient; (b) contacting the sample with an oligonucleotide probe comprising at
least about 15 contiguous nucleotides of a DNA sequence encoding the above
polypeptides; and (c) detecting in the sample a DNA sequence that hybridizes to the
oligonucleotide probe.

In yet another aspect, the present invention provides antibodies, both
polyclonal and monoclonal, that bind to the polypeptides described above, as well as
methods for their use in the detection of M. tuberculosis infection.

These and other aspects of the present invention will become apparent
upon reference to the following detailed description and attached drawings. All
references disclosed herein are hereby incorporated by reference in their entirety as if
each was incorporated individually.

Brief Description of the Drawings and Sequence Identifiers 

Figures 1A and B illustrate the stimulation of proliferation and interferon-γ
production in T cells derived from a first and a second M. tuberculosis-immune donor,
respectively, by the 14 Kd, 20 Kd and 26 Kd antigens described in Example 1.

Figure 2 illustrates the reactivity of two representative polypeptides with
sera from M. tuberculosis-infected and uninfected individuals, as compared to the
reactivity of bacterial lysate.

Figure 3 shows the reactivity of four representative polypeptides with
sera from M. tuberculosis-infected and uninfected individuals, as compared to the
reactivity of the 38 kD antigen.

Figure 4 shows the reactivity of recombinant 38 kD and TbRa11
antigens with sera from M. tuberculosis patients, PPD positive donors and normal
donors.

Figure 5 shows the reactivity of the antigen TbRa2A with 38 kD
negative sera.

Figure 6 shows the reactivity of the antigen of SEQ ID No. 60 with sera
from M. tuberculosis patients and normal donors.

SEQ. ID NO. 1 is the DNA sequence of TbRa1.
SEQ. ID NO. 2 is the DNA sequence of TbRa10.
SEQ. ID NO. 3 is the DNA sequence of TbRa11.
SEQ. ID NO. 4 is the DNA sequence of TbRa12.
SEQ. ID NO. 5 is the DNA sequence of TbRa13.
SEQ. ID NO. 6 is the DNA sequence of TbRa16.
SEQ. ID NO. 7 is the DNA sequence of TbRa17.
SEQ. ID NO. 8 is the DNA sequence of TbRa18.
SEQ. ID NO. 9 is the DNA sequence of TbRa19.
SEQ. ID NO. 10 is the DNA sequence of TbRa24.
SEQ. ID NO. 11 is the DNA sequence of TbRa26.
SEQ. ID NO. 12 is the DNA sequence of TbRa28.
SEQ. ID NO. 13 is the DNA sequence of TbRa29.
SEQ. ID NO. 14 is the DNA sequence of TbRa2A.
SEQ. ID NO. 15 is the DNA sequence of TbRa3.
SEQ. ID NO. 16 is the DNA sequence of TbRa32.
SEQ. ID NO. 17 is the DNA sequence of TbRa35.
SEQ. ID NO. 18 is the DNA sequence of TbRa36.
SEQ. ID NO. 19 is the DNA sequence of TbRa4.
SEQ. ID NO. 20 is the DNA sequence of TbRa9.
SEQ. ID NO. 21 is the DNA sequence of TbRaB.
SEQ. ID NO. 22 is the DNA sequence of TbRaC.
SEQ. ID NO. 23 is the DNA sequence of TbRaD.
SEQ. ID NO. 24 is the DNA sequence of YYWCPG.
SEQ. ID NO. 25 is the DNA sequence of AAMK.
SEQ. ID NO. 26 is the DNA sequence of TbL-23.
SEQ. ID NO. 27 is the DNA sequence of TbL-24.
SEQ. ID NO. 28 is the DNA sequence of TbL-25.
SEQ. ID NO. 29 is the DNA sequence of TbL-28.
SEQ. ID NO. 30 is the DNA sequence of TbL-29.
SEQ. ID NO. 31 is the DNA sequence of TbH-5.
SEQ. ID NO. 32 is the DNA sequence of TbH-8.
SEQ. ID NO. 33 is the DNA sequence of TbH-9.
SEQ. ID NO. 34 is the DNA sequence of TbM-1.
SEQ. ID NO. 35 is the DNA sequence of TbM-3.
SEQ. ID NO. 36 is the DNA sequence of TbM-6.
SEQ. ID NO. 37 is the DNA sequence of TbM-7.
SEQ. ID NO. 38 is the DNA sequence of TbM-9.
SEQ. ID NO. 39 is the DNA sequence of TbM-12.
SEQ. ID NO. 40 is the DNA sequence of TbM-13.
SEQ. ID NO. 41 is the DNA sequence of TbM-14.
SEQ. ID NO. 42 is the DNA sequence of TbM-15.
SEQ. ID NO. 43 is the DNA sequence of TbH-4.
SEQ. ID NO. 44 is the DNA sequence of TbH-4-FWD.
SEQ. ID NO. 45 is the DNA sequence of TbH-12.
SEQ. ID NO. 46 is the DNA sequence of Tb38-1.
SEQ. ID NO. 47 is the DNA sequence of Tb38-4.
SEQ. ID NO. 48 is the DNA sequence of TbL-17.

SEQ. ID NO. 49 is the DNA sequence of TbL-20. 

SEQ. ID NO. 50 is the DNA sequence of TbL-21. 

SEQ. ID NO. 51 is the DNA sequence of TbH-16. 

SEQ. ID NO. 52 is the DNA sequence of DPEP. 

SEQ. ID NO. 53 is the deduced amino acid sequence of DPEP. 

SEQ. ID NO. 54 is the protein sequence of DPV N-terminal Antigen. 

SEQ. ID NO. 55 is the protein sequence of AVGS N-terminal Antigen. 

SEQ. ID NO. 56 is the protein sequence of AAMK N-terminal Antigen. 

SEQ. ID NO. 57 is the protein sequence of YYWC N-terminal Antigen. 

SEQ. ID NO. 58 is the protein sequence of DIGS N-terminal Antigen. 

SEQ. ID NO. 59 is the protein sequence of AEES N-terminal Antigen. 

SEQ. ID NO. 60 is the protein sequence of DPEP N-terminal Antigen. 

SEQ. ID NO. 61 is the protein sequence of APKT N-terminal Antigen. 

SEQ. ID NO. 62 is the protein sequence of DPAS N-terminal Antigen. 

SEQ. ID NO. 63 is the deduced amino acid sequence of TbM-1 Peptide. 

SEQ. ID NO. 64 is the deduced amino acid sequence of TbRa1.

SEQ. ID NO. 65 is the deduced amino acid sequence of TbRa10.

SEQ. ID NO. 66 is the deduced amino acid sequence of TbRa11.

SEQ. ID NO. 67 is the deduced amino acid sequence of TbRa12.

SEQ. ID NO. 68 is the deduced amino acid sequence of TbRa13.
SEQ. ID NO. 69 is the deduced amino acid sequence of TbRa16.
SEQ. ID NO. 70 is the deduced amino acid sequence of TbRa17.
SEQ. ID NO. 71 is the deduced amino acid sequence of TbRa18.
SEQ. ID NO. 72 is the deduced amino acid sequence of TbRa19.
SEQ. ID NO. 73 is the deduced amino acid sequence of TbRa24. 
SEQ. ID NO. 74 is the deduced amino acid sequence of TbRa26. 
SEQ. ID NO. 75 is the deduced amino acid sequence of TbRa28. 
SEQ. ID NO. 76 is the deduced amino acid sequence of TbRa29. 
SEQ. ID NO. 77 is the deduced amino acid sequence of TbRa2A. 
SEQ. ID NO. 78 is the deduced amino acid sequence of TbRa3. 



SEQ. ID NO. 79 is the deduced amino acid sequence of TbRa32. 

SEQ. ID NO. 80 is the deduced amino acid sequence of TbRa35. 

SEQ. ID NO. 81 is the deduced amino acid sequence of TbRa36. 

SEQ. ID NO. 82 is the deduced amino acid sequence of TbRa4. 

SEQ. ID NO. 83 is the deduced amino acid sequence of TbRa9.

SEQ. ID NO. 84 is the deduced amino acid sequence of TbRaB. 

SEQ. ID NO. 85 is the deduced amino acid sequence of TbRaC. 

SEQ. ID NO. 86 is the deduced amino acid sequence of TbRaD. 

SEQ. ID NO. 87 is the deduced amino acid sequence of YYWCPG. 

SEQ. ID NO. 88 is the deduced amino acid sequence of TbAAMK. 

SEQ. ID NO. 89 is the deduced amino acid sequence of Tb38-1. 

SEQ. ID NO. 90 is the deduced amino acid sequence of TbH-4. 

SEQ. ID NO. 91 is the deduced amino acid sequence of TbH-8. 

SEQ. ID NO. 92 is the deduced amino acid sequence of TbH-9. 

SEQ. ID NO. 93 is the deduced amino acid sequence of TbH-12. 

SEQ. ID NO. 94 is the DNA sequence of DPAS. 

SEQ. ID NO. 95 is the deduced amino acid sequence of DPAS. 

SEQ. ID NO. 96 is the DNA sequence of DPV. 

SEQ. ID NO. 97 is the deduced amino acid sequence of DPV. 

SEQ. ID NO. 98 is the DNA sequence of ESAT-6. 

SEQ. ID NO. 99 is the deduced amino acid sequence of ESAT-6. 

SEQ. ID NO. 100 is the DNA sequence of TbH-8-2. 

SEQ. ID NO. 101 is the DNA sequence of TbH-9FL. 

SEQ. ID NO. 102 is the deduced amino acid sequence of TbH-9FL. 

SEQ. ID NO. 103 is the DNA sequence of TbH-9-1. 

SEQ. ID NO. 104 is the deduced amino acid sequence of TbH-9-1. 

SEQ. ID NO. 105 is the DNA sequence of TbH-9-4. 

SEQ. ID NO. 106 is the deduced amino acid sequence of TbH-9-4. 

SEQ. ID NO. 107 is the DNA sequence of Tb38-1F2 IN. 

SEQ. ID NO. 108 is the DNA sequence of Tb38-1F2 RP. 

SEQ. ID NO. 109 is the deduced amino acid sequence of Tb37-FL. 

SEQ. ID NO. 110 is the deduced amino acid sequence of Tb38-IN.

SEQ. ID NO. 111 is the DNA sequence of Tb38-1F3.

SEQ. ID NO. 112 is the deduced amino acid sequence of Tb38-1F3.

SEQ. ID NO. 113 is the DNA sequence of Tb38-1F5.

SEQ. ID NO. 114 is the DNA sequence of Tb38-1F6.

SEQ. ID NO. 115 is the deduced N-terminal amino acid sequence of DPV.
SEQ. ID NO. 116 is the deduced N-terminal amino acid sequence of AVGS.
SEQ. ID NO. 117 is the deduced N-terminal amino acid sequence of AAMK.
SEQ. ID NO. 118 is the deduced N-terminal amino acid sequence of YYWC.
SEQ. ID NO. 119 is the deduced N-terminal amino acid sequence of DIGS.
SEQ. ID NO. 120 is the deduced N-terminal amino acid sequence of AAES.
SEQ. ID NO. 121 is the deduced N-terminal amino acid sequence of DPEP.
SEQ. ID NO. 122 is the deduced N-terminal amino acid sequence of APKT.
SEQ. ID NO. 123 is the deduced N-terminal amino acid sequence of DPAS.
SEQ. ID NO. 124 is the protein sequence of DPPD N-terminal Antigen.
SEQ. ID NO. 125-128 are the protein sequences of four DPPD cyanogen
bromide fragments.
SEQ. ID NO. 129 is the N-terminal protein sequence of XDS antigen.
SEQ. ID NO. 130 is the N-terminal protein sequence of AGD antigen.
SEQ. ID NO. 131 is the N-terminal protein sequence of APE antigen.
SEQ. ID NO. 132 is the N-terminal protein sequence of XYI antigen.

Detailed Description of the Invention 

As noted above, the present invention is generally directed to
compositions and methods for diagnosing tuberculosis. The compositions of the subject
invention include polypeptides that comprise at least one antigenic portion of an
M. tuberculosis antigen, or a variant of such an antigen that differs only in conservative
substitutions and/or modifications. Polypeptides within the scope of the present
invention include, but are not limited to, soluble M. tuberculosis antigens. A "soluble
M. tuberculosis antigen" is a protein of M. tuberculosis origin that is present in
M. tuberculosis culture filtrate. As used herein, the term "polypeptide" encompasses
amino acid chains of any length, including full length proteins (i.e., antigens), wherein
the amino acid residues are linked by covalent peptide bonds. Thus, a polypeptide
comprising an antigenic portion of one of the above antigens may consist entirely of the
antigenic portion, or may contain additional sequences. The additional sequences may
be derived from the native M. tuberculosis antigen or may be heterologous, and such
sequences may (but need not) be antigenic.

An "antigenic portion" of an antigen (which may or may not be soluble)
is a portion that is capable of reacting with sera obtained from an
M. tuberculosis-infected individual (i.e., generates an absorbance reading with sera from infected
individuals that is at least three standard deviations above the absorbance obtained with
sera from uninfected individuals, in a representative ELISA assay described herein).
An "M. tuberculosis-infected individual" is a human who has been infected with
M. tuberculosis (e.g., has an intradermal skin test response to PPD that is at least 0.5 cm
in diameter). Infected individuals may display symptoms of tuberculosis or may be free
of disease symptoms. Polypeptides comprising at least an antigenic portion of one or
more M. tuberculosis antigens as described herein may generally be used, alone or in
combination, to detect tuberculosis in a patient.
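
The three-standard-deviation criterion above can be sketched numerically. The following Python example is a minimal illustration, assuming that "three standard deviations above the absorbance obtained with sera from uninfected individuals" is read as the mean of the uninfected readings plus three sample standard deviations; that reading, the function names and the absorbance values are assumptions made for illustration and are not data from this disclosure.

```python
from statistics import mean, stdev


def positive_cutoff(uninfected_absorbances, n_sd=3.0):
    """Cutoff = mean + n_sd * sample SD of readings from uninfected sera.

    Assumption: the sample (n - 1) standard deviation is used.
    """
    return mean(uninfected_absorbances) + n_sd * stdev(uninfected_absorbances)


def is_reactive(absorbance, uninfected_absorbances, n_sd=3.0):
    """True if a reading from an infected-individual serum meets the cutoff."""
    return absorbance >= positive_cutoff(uninfected_absorbances, n_sd)


# Hypothetical ELISA absorbance readings for uninfected control sera.
controls = [0.10, 0.12, 0.08, 0.10]

print(round(positive_cutoff(controls), 4))
print(is_reactive(0.50, controls))
print(is_reactive(0.12, controls))
```

With only a handful of control sera the standard deviation estimate is noisy, so in practice a larger control panel would give a more stable cutoff.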

The compositions and methods of this invention also encompass variants
of the above polypeptides. A "variant," as used herein, is a polypeptide that differs from
the native antigen only in conservative substitutions and/or modifications, such that the
antigenic properties of the polypeptide are retained. Such variants may generally be
identified by modifying one of the above polypeptide sequences, and evaluating the
antigenic properties of the modified polypeptide using, for example, the representative
procedures described herein.

A "conservative substitution" is one in which an amino acid is
substituted for another amino acid that has similar properties, such that one skilled in
the art of peptide chemistry would expect the secondary structure and hydropathic
nature of the polypeptide to be substantially unchanged. In general, the following
groups of amino acids represent conservative changes: (1) ala, pro, gly, glu, asp, gln,
asn, ser, thr; (2) cys, ser, tyr, thr; (3) val, ile, leu, met, ala, phe; (4) lys, arg, his; and
(5) phe, tyr, trp, his.
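
The five groups listed above lend themselves to a simple lookup. The following Python sketch is an illustration only: it encodes the groups as sets and tests whether a variant differs from a native sequence solely by substitutions within a shared group. The function names and the example residue lists are hypothetical, and the sketch covers equal-length substitutions only, not the "modifications" (deletions or additions) also contemplated by the text.

```python
# The five groups of amino acids recited in the text as conservative changes.
CONSERVATIVE_GROUPS = [
    {"ala", "pro", "gly", "glu", "asp", "gln", "asn", "ser", "thr"},
    {"cys", "ser", "tyr", "thr"},
    {"val", "ile", "leu", "met", "ala", "phe"},
    {"lys", "arg", "his"},
    {"phe", "tyr", "trp", "his"},
]


def is_conservative(original, replacement):
    """True if two different residues share at least one of the groups."""
    a, b = original.lower(), replacement.lower()
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def differs_only_conservatively(native, variant):
    """True if every changed position is a conservative substitution.

    Both arguments are equal-length lists of three-letter residue names.
    """
    if len(native) != len(variant):
        raise ValueError("sketch handles equal-length sequences only")
    return all(
        n.lower() == v.lower() or is_conservative(n, v)
        for n, v in zip(native, variant)
    )


print(is_conservative("Lys", "Arg"))   # both in group (4)
print(is_conservative("Lys", "Glu"))   # no shared group
print(differs_only_conservatively(["Ala", "Lys", "Ser"], ["Val", "Arg", "Ser"]))
```

Because several residues (for example ser, thr, ala, his, phe and tyr) appear in more than one group, membership is tested against every group rather than assigning each residue a single class.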

Variants may also (or alternatively) be modified by, for example, the
deletion or addition of amino acids that have minimal influence on the antigenic
properties, secondary structure and hydropathic nature of the polypeptide. For example,
a polypeptide may be conjugated to a signal (or leader) sequence at the N-terminal end
of the protein which co-translationally or post-translationally directs transfer of the
protein. The polypeptide may also be conjugated to a linker or other sequence for ease
of synthesis, purification or identification of the polypeptide (e.g., poly-His), or to
enhance binding of the polypeptide to a solid support. For example, a polypeptide may
be conjugated to an immunoglobulin Fc region.

In a related aspect, combination polypeptides are disclosed. A
"combination polypeptide" is a polypeptide comprising at least one of the above
antigenic portions and one or more additional antigenic M. tuberculosis sequences,
which are joined via a peptide linkage into a single amino acid chain. The sequences
may be joined directly (i.e., with no intervening amino acids) or may be joined by way
of a linker sequence (e.g., Gly-Cys-Gly) that does not significantly diminish the
antigenic properties of the component polypeptides.

In general, M. tuberculosis antigens, and DNA sequences encoding such
antigens, may be prepared using any of a variety of procedures. For example, soluble
antigens may be isolated from M. tuberculosis culture filtrate by procedures known to
those of ordinary skill in the art, including anion-exchange and reverse phase
chromatography. Purified antigens may then be evaluated for a desired property, such
as the ability to react with sera obtained from an M. tuberculosis-infected individual.
Such screens may be performed using the representative methods described herein.
Antigens may then be partially sequenced using, for example, traditional Edman
chemistry. See Edman and Berg, Eur. J. Biochem. 50:116-132, 1967.

Antigens may also be produced recombinantly using a DNA sequence
that encodes the antigen, which has been inserted into an expression vector and
expressed in an appropriate host. DNA molecules encoding soluble antigens may be
isolated by screening an appropriate M. tuberculosis expression library with anti-sera
(e.g., rabbit) raised specifically against soluble M. tuberculosis antigens. DNA
sequences encoding antigens that may or may not be soluble may be identified by
screening an appropriate M. tuberculosis genomic or cDNA expression library with sera
obtained from patients infected with M. tuberculosis. Such screens may generally be
performed using techniques well known in the art, such as those described in Sambrook
et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratories,
Cold Spring Harbor, NY, 1989.

DNA sequences encoding soluble antigens may also be obtained by
screening an appropriate M. tuberculosis cDNA or genomic DNA library for DNA
sequences that hybridize to degenerate oligonucleotides derived from partial amino acid
sequences of isolated soluble antigens. Degenerate oligonucleotide sequences for use in
such a screen may be designed and synthesized, and the screen may be performed, as
described (for example) in Sambrook et al., Molecular Cloning: A Laboratory Manual,
Cold Spring Harbor Laboratories, Cold Spring Harbor, NY (and references cited
therein). Polymerase chain reaction (PCR) may also be employed, using the above
oligonucleotides in methods well known in the art, to isolate a nucleic acid probe from a
cDNA or genomic library. The library screen may then be performed using the isolated
probe.
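
To illustrate how a degenerate oligonucleotide can be derived from a partial amino acid sequence, the following Python sketch back-translates a short peptide using the standard genetic code and IUPAC ambiguity codes. It is a minimal sketch under stated assumptions, not the procedure used by the inventors: the function names are hypothetical, each codon position is collapsed to the union of bases seen at that position (which over-covers for Leu, Arg and Ser), and practical design considerations such as codon usage in M. tuberculosis are ignored. The example peptide is the first six residues of sequence (d), Tyr-Tyr-Trp-Cys-Pro-Gly.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code with codons enumerated in TCAG order
# (first base varies slowest, third base fastest); '*' marks stop codons.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# IUPAC nucleotide ambiguity codes, keyed by the sorted set of bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "AG": "R", "CT": "Y", "CG": "S", "AT": "W", "GT": "K", "AC": "M",
    "CGT": "B", "AGT": "D", "ACT": "H", "ACG": "V", "ACGT": "N",
}


def degenerate_codon(amino_acid):
    """Return (IUPAC codon, degeneracy) covering all codons of one residue."""
    codons = [c for c, aa in CODON_TABLE.items() if aa == amino_acid]
    if not codons:
        raise ValueError(f"unknown amino acid: {amino_acid!r}")
    code, degeneracy = "", 1
    for position in range(3):
        bases = {codon[position] for codon in codons}
        code += IUPAC["".join(sorted(bases))]
        degeneracy *= len(bases)
    return code, degeneracy


def degenerate_oligo(peptide):
    """Back-translate a one-letter peptide into a degenerate oligonucleotide."""
    oligo, degeneracy = "", 1
    for amino_acid in peptide.upper():
        code, n = degenerate_codon(amino_acid)
        oligo += code
        degeneracy *= n
    return oligo, degeneracy


print(degenerate_oligo("YYWCPG"))
```

The degeneracy value is the number of distinct sequences in the oligonucleotide pool; stretches rich in Trp, Met, Tyr or Cys keep that number low, which is one reason such stretches are often preferred when choosing where to place a probe.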

Regardless of the method of preparation, the antigens described herein
are "antigenic." More specifically, the antigens have the ability to react with sera
obtained from an M. tuberculosis-infected individual. Reactivity may be evaluated
using, for example, the representative ELISA assays described herein, where an
absorbance reading with sera from infected individuals that is at least three standard
deviations above the absorbance obtained with sera from uninfected individuals is
considered positive.

Antigenic portions of M. tuberculosis antigens may be prepared and
identified using well known techniques, such as those summarized in Paul,
Fundamental Immunology, 3d ed., Raven Press, 1993, pp. 243-247 and references cited
therein. Such techniques include screening polypeptide portions of the native antigen
for antigenic properties. The representative ELISAs described herein may generally be
employed in these screens. An antigenic portion of a polypeptide is a portion that,
within such representative assays, generates a signal that is substantially
similar to that generated by the full length antigen. In other words, an antigenic portion
of an M. tuberculosis antigen generates at least about 20%, and preferably about 100%,
of the signal induced by the full length antigen in a model ELISA as described herein.
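
The 20% criterion above amounts to a ratio test. A minimal Python sketch follows, assuming both signals are measured in the same assay and are already background-corrected; the function name and the numeric readings are hypothetical.

```python
def is_antigenic_portion(portion_signal, full_length_signal, min_fraction=0.20):
    """True if the portion yields at least `min_fraction` of the full-length signal.

    Assumption: both signals come from the same ELISA and are background-corrected.
    """
    if full_length_signal <= 0:
        raise ValueError("full-length signal must be positive")
    return portion_signal / full_length_signal >= min_fraction


# Hypothetical absorbance readings for a fragment and the full-length antigen.
print(is_antigenic_portion(0.25, 1.0))
print(is_antigenic_portion(0.15, 1.0))
```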

Portions and other variants of M. tuberculosis antigens may be generated
by synthetic or recombinant means. Synthetic polypeptides having fewer than about
100 amino acids, and generally fewer than about 50 amino acids, may be generated
using techniques well known in the art. For example, such polypeptides may be
synthesized using any of the commercially available solid-phase techniques, such as the
Merrifield solid-phase synthesis method, where amino acids are sequentially added to a
growing amino acid chain. See Merrifield, J. Am. Chem. Soc. 85:2149-2154, 1963.
Equipment for automated synthesis of polypeptides is commercially available from
suppliers such as Applied BioSystems, Inc., Foster City, CA, and may be operated
according to the manufacturer's instructions. Variants of a native antigen may generally
be prepared using standard mutagenesis techniques, such as oligonucleotide-directed
site-specific mutagenesis. Sections of the DNA sequence may also be removed using
standard techniques to permit preparation of truncated polypeptides.

Recombinant polypeptides containing portions and/or variants of a
native antigen may be readily prepared from a DNA sequence encoding the polypeptide
using a variety of techniques well known to those of ordinary skill in the art. For
example, supernatants from suitable host/vector systems which secrete recombinant
protein into culture media may be first concentrated using a commercially available
filter. Following concentration, the concentrate may be applied to a suitable
purification matrix such as an affinity matrix or an ion exchange resin. Finally, one or
more reverse phase HPLC steps can be employed to further purify a recombinant
protein.

Any of a variety of expression vectors known to those of ordinary skill in
the art may be employed to express recombinant polypeptides as described herein.
Expression may be achieved in any appropriate host cell that has been transformed or
transfected with an expression vector containing a DNA molecule that encodes a
recombinant polypeptide. Suitable host cells include prokaryotes, yeast and higher
eukaryotic cells. Preferably, the host cells employed are E. coli, yeast or a mammalian
cell line, such as COS or CHO. The DNA sequences expressed in this manner may
encode naturally occurring antigens, portions of naturally occurring antigens, or other
variants thereof.

In general, regardless of the method of preparation, the polypeptides 
disclosed herein are prepared in substantially pure form. Preferably, the polypeptides 
are at least about 80% pure, more preferably at least about 90% pure and most 
preferably at least about 99% pure. For use in the methods described herein, however, 
such substantially pure polypeptides may be combined. 

In certain specific embodiments, the subject invention discloses
polypeptides comprising at least an antigenic portion of a soluble M. tuberculosis
antigen (or a variant of such an antigen), where the antigen has one of the following
N-terminal sequences:

(a) Asp-Pro-Val-Asp-Ala-Val-Ile-Asn-Thr-Thr-Cys-Asn-Tyr-Gly-
Gln-Val-Val-Ala-Ala-Leu (SEQ ID No. 115);

(b) Ala-Val-Glu-Ser-Gly-Met-Leu-Ala-Leu-Gly-Thr-Pro-Ala-Pro-
Ser (SEQ ID No. 116);

(c) Ala-Ala-Met-Lys-Pro-Arg-Thr-Gly-Asp-Gly-Pro-Leu-Glu-Ala-
Ala-Lys-Glu-Gly-Arg (SEQ ID No. 117);

(d) Tyr-Tyr-Trp-Cys-Pro-Gly-Gln-Pro-Phe-Asp-Pro-Ala-Trp-Gly-
Pro (SEQ ID No. 118);

(e) Asp-Ile-Gly-Ser-Glu-Ser-Thr-Glu-Asp-Gln-Gln-Xaa-Ala-Val
(SEQ ID No. 119);

(f) Ala-Glu-Glu-Ser-Ile-Ser-Thr-Xaa-Glu-Xaa-Ile-Val-Pro (SEQ ID
No. 120);

(g) Asp-Pro-Glu-Pro-Ala-Pro-Pro-Val-Pro-Thr-Thr-Ala-Ala-Ser-
Pro-Pro-Ser (SEQ ID No. 121);

(h) Ala-Pro-Lys-Thr-Tyr-Xaa-Glu-Glu-Leu-Lys-Gly-Thr-Asp-Thr-
Gly (SEQ ID No. 122);

(i) Asp-Pro-Ala-Ser-Ala-Pro-Asp-Val-Pro-Thr-Ala-Ala-Gln-Gln-
Thr-Ser-Leu-Leu-Asn-Ser-Leu-Ala-Asp-Pro-Asn-Val-Ser-Phe-
Ala-Asn (SEQ ID No. 123);

(j) Xaa-Asp-Ser-Glu-Lys-Ser-Ala-Thr-Ile-Lys-Val-Thr-Asp-Ala-
Ser; (SEQ ID No. 129)

(k) Ala-Gly-Asp-Thr-Xaa-Ile-Tyr-Ile-Val-Gly-Asn-Leu-Thr-Ala-
Asp; (SEQ ID No. 130) or

(l) Ala-Pro-Glu-Ser-Gly-Ala-Gly-Leu-Gly-Gly-Thr-Val-Gln-Ala-
Gly; (SEQ ID No. 131)
wherein Xaa may be any amino acid, preferably a cysteine residue. A DNA sequence
encoding the antigen identified as (g) above is provided in SEQ ID No. 52, the deduced
amino acid sequence of which is provided in SEQ ID No. 53. A DNA sequence
encoding the antigen identified as (a) above is provided in SEQ ID No. 96; its deduced
amino acid sequence is provided in SEQ ID No. 97. A DNA sequence corresponding to
antigen (d) above is provided in SEQ ID No. 24, a DNA sequence corresponding to
antigen (c) is provided in SEQ ID No. 25 and a DNA sequence corresponding to antigen
(I) is disclosed in SEQ ID No. 94 and its deduced amino acid sequence is provided in
SEQ ID No. 95.

In a further specific embodiment, the subject invention discloses
polypeptides comprising at least an immunogenic portion of an M. tuberculosis antigen
having one of the following N-terminal sequences, or a variant thereof that differs only
in conservative substitutions and/or modifications:

(m) Xaa-Tyr-Ile-Ala-Tyr-Xaa-Thr-Thr-Ala-Gly-Ile-Val-Pro-Gly-Lys-
Ile-Asn-Val-His-Leu-Val; (SEQ ID No. 132) or

(n) Asp-Pro-Pro-Asp-Pro-His-Gln-Xaa-Asp-Met-Thr-Lys-Gly-Tyr-
Tyr-Pro-Gly-Gly-Arg-Arg-Xaa-Phe; (SEQ ID No. 124)
wherein Xaa may be any amino acid, preferably a cysteine residue.
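
By way of illustration only, the comparison of an N-terminal sequence containing Xaa positions with a candidate protein can be sketched as follows. The function names are hypothetical and are not part of the disclosure; the sketch assumes the candidate protein is supplied in one-letter code and treats Xaa as matching any residue.

```python
# Illustrative sketch: match a three-letter N-terminal sequence (Xaa = any residue)
# against the start of a candidate protein written in one-letter code.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(three_letter_seq):
    """Convert 'Asp-Pro-Xaa' style text to one-letter code, using 'X' for Xaa."""
    compact = "".join(three_letter_seq.split())  # drop spaces and line breaks
    residues = [r for r in compact.split("-") if r]
    return "".join("X" if r == "Xaa" else THREE_TO_ONE[r] for r in residues)


def matches_n_terminus(pattern, protein):
    """Return True if protein begins with pattern; 'X' in pattern matches any residue."""
    if len(protein) < len(pattern):
        return False
    return all(p == "X" or p == a for p, a in zip(pattern, protein))
```

For example, the sequence of SEQ ID No. 119 converts to DIGSESTEDQQXAV, which would match a hypothetical protein beginning DIGSESTEDQQCAV.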

In other specific embodiments, the subject invention discloses
polypeptides comprising at least an antigenic portion of a soluble M. tuberculosis
antigen (or a variant of such an antigen) that comprises one or more of the amino acid
sequences encoded by (a) the DNA sequences of SEQ ID Nos. 1, 2, 4-10, 13-25, 52, 94
and 96, (b) the complements of such DNA sequences, or (c) DNA sequences
substantially homologous to a sequence in (a) or (b).

In further specific embodiments, the subject invention discloses
polypeptides comprising at least an antigenic portion of an M. tuberculosis antigen (or a
variant of such an antigen), which may or may not be soluble, that comprises one or
more of the amino acid sequences encoded by (a) the DNA sequences of SEQ ID
Nos. 26-51, (b) the complements of such DNA sequences or (c) DNA sequences
substantially homologous to a sequence in (a) or (b).

In the specific embodiments discussed above, the M. tuberculosis
antigens include variants that are encoded by DNA sequences which are substantially
homologous to one or more of the DNA sequences specifically recited herein. "Substantial
homology," as used herein, refers to DNA sequences that are capable of hybridizing
under moderately stringent conditions. Suitable moderately stringent conditions include
prewashing in a solution of 5X SSC, 0.5% SDS, 1.0 mM EDTA (pH 8.0); hybridizing
at 50°C-65°C, 5X SSC, overnight or, in the event of cross-species homology, at 45°C
with 0.5X SSC; followed by washing twice at 65°C for 20 minutes with each of 2X,
0.5X and 0.2X SSC containing 0.1% SDS. Such hybridizing DNA sequences are also
within the scope of this invention, as are nucleotide sequences that, due to code
degeneracy, encode an immunogenic polypeptide that is encoded by a hybridizing DNA
sequence.
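
The effect of code degeneracy can be illustrated with a short sketch that counts how many distinct DNA sequences encode a given peptide. It assumes the standard genetic code (61 sense codons); the function and table names are hypothetical and serve only as an example.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code (one-letter keys).
CODONS_PER_RESIDUE = {
    "L": 6, "S": 6, "R": 6,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "I": 3,
    "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "H": 2, "K": 2, "F": 2, "Y": 2,
    "M": 1, "W": 1,
}


def count_encoding_sequences(peptide):
    """Return the number of distinct DNA sequences that encode the peptide."""
    return prod(CODONS_PER_RESIDUE[residue] for residue in peptide.upper())
```

Under this assumption, even the five-residue peptide Ala-Val-Glu-Ser-Gly (AVESG) is encoded by 4 x 4 x 2 x 6 x 4 = 768 different DNA sequences.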

In a related aspect, the present invention provides fusion proteins
comprising a first and a second inventive polypeptide or, alternatively, a polypeptide of
the present invention and a known M. tuberculosis antigen, such as the 38 kD antigen
described above or ESAT-6 (SEQ ID Nos. 98 and 99), together with variants of such
fusion proteins. The fusion proteins of the present invention may also include a linker
peptide between the first and second polypeptides.

A DNA sequence encoding a fusion protein of the present invention is
constructed using known recombinant DNA techniques to assemble separate DNA
sequences encoding the first and second polypeptides into an appropriate expression
vector. The 3' end of a DNA sequence encoding the first polypeptide is ligated, with or
without a peptide linker, to the 5' end of a DNA sequence encoding the second
polypeptide so that the reading frames of the sequences are in phase to permit mRNA
translation of the two DNA sequences into a single fusion protein that retains the
biological activity of both the first and the second polypeptides.
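
The requirement that the reading frames remain in phase can be sketched in code. The example below is a simplified illustration under stated assumptions: each segment is supplied as a coding strand written 5' to 3', the stop codon of the first sequence is removed so that translation continues into the second, and the function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def split_codons(dna):
    """Split a coding sequence into consecutive triplets."""
    return [dna[i:i + 3] for i in range(0, len(dna), 3)]


def build_fusion_orf(first_orf, second_orf, linker=""):
    """Join two coding sequences, with an optional linker, in a single reading frame."""
    first, link, second = (part.upper() for part in (first_orf, linker, second_orf))
    for part in (first, link, second):
        if len(part) % 3 != 0:
            raise ValueError("each segment must be a whole number of codons")
    first_codons = split_codons(first)
    # Drop the stop codon of the first sequence so translation reads through.
    if first_codons and first_codons[-1] in STOP_CODONS:
        first_codons.pop()
    fusion = "".join(first_codons) + link + second
    # Any stop codon before the final codon would truncate the fusion protein.
    if any(codon in STOP_CODONS for codon in split_codons(fusion)[:-1]):
        raise ValueError("internal stop codon would truncate the fusion protein")
    return fusion
```

In this sketch a segment whose length is not a multiple of three is rejected, since ligating it would shift the frame of everything downstream.
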

A peptide linker sequence may be employed to separate the first and the
second polypeptides by a distance sufficient to ensure that each polypeptide folds into
its secondary and tertiary structures. Such a peptide linker sequence is incorporated into
the fusion protein using standard techniques well known in the art. Suitable peptide
linker sequences may be chosen based on the following factors: (1) their ability to
adopt a flexible extended conformation; (2) their inability to adopt a secondary structure
that could interact with functional epitopes on the first and second polypeptides; and
(3) the lack of hydrophobic or charged residues that might react with the polypeptide
functional epitopes. Preferred peptide linker sequences contain Gly, Asn and Ser
residues. Other near neutral amino acids, such as Thr and Ala, may also be used in the
linker sequence. Amino acid sequences which may be usefully employed as linkers
include those disclosed in Maratea et al., Gene 40:39-46, 1985; Murphy et al., Proc.
Natl. Acad. Sci. USA 83:8258-8262, 1986; U.S. Patent No. 4,935,233 and U.S. Patent
No. 4,751,180. The linker sequence may be from 1 to about 50 amino acids in length.
Peptide linker sequences are not required when the first and second polypeptides have
non-essential N-terminal amino acid regions that can be used to separate the functional
domains and prevent steric hindrance.
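
As a rough illustration, the composition and length guidance above can be expressed as a simple screen. The sketch checks only the residue composition and the 1 to 50 residue length range; it does not evaluate conformation or secondary structure, and the function name and the strict treatment of the upper length limit are assumptions made for the example.

```python
PREFERRED = set("GNS")      # Gly, Asn, Ser
NEAR_NEUTRAL = set("TA")    # Thr, Ala


def check_linker(linker, max_len=50):
    """Return a list of problems found in a one-letter linker sequence (empty if none)."""
    linker = linker.upper()
    problems = []
    if not 1 <= len(linker) <= max_len:
        problems.append(f"length {len(linker)} is outside 1-{max_len}")
    other = sorted(set(linker) - PREFERRED - NEAR_NEUTRAL)
    if other:
        problems.append("residues outside Gly/Asn/Ser/Thr/Ala: " + ", ".join(other))
    return problems
```

A hypothetical linker such as GGSGGS passes this screen, whereas one containing a charged residue such as Lys is flagged.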

In another aspect, the present invention provides methods for using the
polypeptides described above to diagnose tuberculosis. In this aspect, methods are
provided for detecting M. tuberculosis infection in a biological sample, using one or
more of the above polypeptides, alone or in combination. In embodiments in which
multiple polypeptides are employed, polypeptides other than those specifically
described herein, such as the 38 kD antigen described in Andersen and Hansen, Infect.
Immun. 57:2481-2488, 1989, may be included. As used herein, a "biological sample" is
any antibody-containing sample obtained from a patient. Preferably, the sample is
whole blood, sputum, serum, plasma, saliva, cerebrospinal fluid or urine. More
preferably, the sample is a blood, serum or plasma sample obtained from a patient or a
blood supply. The polypeptide(s) are used in an assay, as described below, to determine
the presence or absence of antibodies to the polypeptide(s) in the sample, relative to a
predetermined cut-off value. The presence of such antibodies indicates previous
sensitization to mycobacterial antigens, which may be indicative of tuberculosis.

In embodiments in which more than one polypeptide is employed, the
polypeptides used are preferably complementary (i.e., one component polypeptide will
tend to detect infection in samples where the infection would not be detected by another
component polypeptide). Complementary polypeptides may generally be identified by
using each polypeptide individually to evaluate serum samples obtained from a series of
patients known to be infected with M. tuberculosis. After determining which samples
test positive (as described below) with each polypeptide, combinations of two or more
polypeptides may be formulated that are capable of detecting infection in most, or all, of
the samples tested. Such polypeptides are complementary. For example, approximately
25-30% of sera from tuberculosis-infected individuals are negative for antibodies to any
single protein, such as the 38 kD antigen mentioned above. Complementary
polypeptides may, therefore, be used in combination with the 38 kD antigen to improve
sensitivity of a diagnostic test.
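
One possible way to formulate such a combination is a greedy selection, sketched below. This is an illustrative approach rather than the procedure of the disclosure: at each step it adds the polypeptide that detects the most samples still undetected. The function name, the antigen labels other than the 38 kD antigen, and the sample data are hypothetical.

```python
def select_complementary(positives, all_samples):
    """Greedily pick polypeptides until no further infected samples can be detected.

    positives maps a polypeptide name to the set of sample IDs that test positive with it.
    Returns (chosen polypeptides in order, samples detected by none of them).
    """
    positives = {name: set(ids) for name, ids in positives.items()}
    undetected = set(all_samples)
    chosen = []
    while undetected and positives:
        # Ties are broken alphabetically because max() keeps the first best candidate.
        best = max(sorted(positives), key=lambda name: len(positives[name] & undetected))
        gained = positives[best] & undetected
        if not gained:
            break
        chosen.append(best)
        undetected -= gained
    return chosen, undetected


# Hypothetical panel of ten sera from infected patients.
panel = {"38kD": {1, 2, 3, 4, 5, 6, 7}, "A": {6, 7, 8, 9}, "B": {8, 9, 10}}
combination, missed = select_complementary(panel, range(1, 11))
```

With this hypothetical panel the 38 kD antigen is chosen first, and polypeptide B is then preferred over A because it detects all three sera that the 38 kD antigen misses.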

There are a variety of assay formats known to those of ordinary skill in
the art for using one or more polypeptides to detect antibodies in a sample. See, e.g.,
Harlow and Lane, Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory,
1988, which is incorporated herein by reference. In a preferred embodiment, the assay
involves the use of polypeptide immobilized on a solid support to bind to and remove
the antibody from the sample. The bound antibody may then be detected using a
detection reagent that contains a reporter group. Suitable detection reagents include
antibodies that bind to the antibody/polypeptide complex and free polypeptide labeled
with a reporter group (e.g., in a semi-competitive assay). Alternatively, a competitive
assay may be utilized, in which an antibody that binds to the polypeptide is labeled with
a reporter group and allowed to bind to the immobilized antigen after incubation of the
antigen with the sample. The extent to which components of the sample inhibit the
binding of the labeled antibody to the polypeptide is indicative of the reactivity of the
sample with the immobilized polypeptide.
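
For the competitive format, the extent of inhibition is commonly expressed as a percentage of the signal obtained without sample. The following minimal sketch assumes that convention, which is not specified in the text above; the function name is hypothetical.

```python
def percent_inhibition(signal_with_sample, signal_without_sample):
    """Percent reduction in labeled-antibody signal caused by pre-incubation with sample."""
    if signal_without_sample <= 0:
        raise ValueError("the uninhibited signal must be positive")
    return 100.0 * (1.0 - signal_with_sample / signal_without_sample)
```

A higher percentage indicates that more of the immobilized polypeptide was occupied by antibodies from the sample.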

The solid support may be any solid material known to those of ordinary 
skill in the art to which the antigen may be attached. For example, the solid support 
may be a test well in a microtiter plate or a nitrocellulose or other suitable membrane.
Alternatively, the support may be a bead or disc, such as glass, fiberglass, latex or a 
plastic material such as polystyrene or polyvinylchloride. The support may also be a 
magnetic particle or a fiber optic sensor, such as those disclosed, for example, in U.S. 
Patent No. 5,359,681. 

The polypeptides may be bound to the solid support using a variety of
techniques known to those of ordinary skill in the art, which are amply described in the
patent and scientific literature. In the context of the present invention, the term "bound"
refers to both noncovalent association, such as adsorption, and covalent attachment
(which may be a direct linkage between the antigen and functional groups on the
support or may be a linkage by way of a cross-linking agent). Binding by adsorption to
a well in a microtiter plate or to a membrane is preferred. In such cases, adsorption may
be achieved by contacting the polypeptide, in a suitable buffer, with the solid support
for a suitable amount of time. The contact time varies with temperature, but is typically
between about 1 hour and 1 day. In general, contacting a well of a plastic microtiter
plate (such as polystyrene or polyvinylchloride) with an amount of polypeptide ranging
from about 10 ng to about 1 µg, and preferably about 100 ng, is sufficient to bind an
adequate amount of antigen.

Covalent attachment of polypeptide to a solid support may generally be
achieved by first reacting the support with a bifunctional reagent that will react with
both the support and a functional group, such as a hydroxyl or amino group, on the
polypeptide. For example, the polypeptide may be bound to supports having an
appropriate polymer coating using benzoquinone or by condensation of an aldehyde
group on the support with an amine and an active hydrogen on the polypeptide (see,
e.g., Pierce Immunotechnology Catalog and Handbook, 1991, at A12-A13).

In certain embodiments, the assay is an enzyme linked immunosorbent
assay (ELISA). This assay may be performed by first contacting a polypeptide antigen
that has been immobilized on a solid support, commonly the well of a microtiter plate,
with the sample, such that antibodies to the polypeptide within the sample are allowed
to bind to the immobilized polypeptide. Unbound sample is then removed from the
immobilized polypeptide and a detection reagent capable of binding to the immobilized
antibody-polypeptide complex is added. The amount of detection reagent that remains
bound to the solid support is then determined using a method appropriate for the
specific detection reagent.

More specifically, once the polypeptide is immobilized on the support as
described above, the remaining protein binding sites on the support are typically
blocked. Any suitable blocking agent known to those of ordinary skill in the art, such
as bovine serum albumin or Tween 20™ (Sigma Chemical Co., St. Louis, MO) may be
employed. The immobilized polypeptide is then incubated with the sample, and
antibody is allowed to bind to the antigen. The sample may be diluted with a suitable
diluent, such as phosphate-buffered saline (PBS) prior to incubation. In general, an
appropriate contact time (i.e., incubation time) is that period of time that is sufficient to
detect the presence of antibody within a M. tuberculosis-infected sample. Preferably,
the contact time is sufficient to achieve a level of binding that is at least 95% of that
achieved at equilibrium between bound and unbound antibody. Those of ordinary skill
in the art will recognize that the time necessary to achieve equilibrium may be readily
determined by assaying the level of binding that occurs over a period of time. At room
temperature, an incubation time of about 30 minutes is generally sufficient.
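
The determination of a suitable contact time from a binding time course can be sketched as follows. The example assumes that the time course was continued long enough for the signal to plateau, so that the highest reading approximates equilibrium binding; the function name and the readings are hypothetical.

```python
def time_to_reach_fraction(times, signals, fraction=0.95):
    """Return the earliest time at which the signal reaches the given fraction of plateau."""
    if not signals or len(times) != len(signals):
        raise ValueError("times and signals must be non-empty and of equal length")
    plateau = max(signals)  # assumption: the highest reading approximates equilibrium
    for t, s in sorted(zip(times, signals)):
        if s >= fraction * plateau:
            return t
    return None


# Hypothetical readings (minutes, optical density).
minutes = [5, 10, 20, 30, 60, 120]
readings = [0.30, 0.55, 0.85, 0.97, 1.00, 1.00]
contact_time = time_to_reach_fraction(minutes, readings)
```

With these hypothetical readings the 95% level is first reached at the 30 minute time point.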

Unbound sample may then be removed by washing the solid support
with an appropriate buffer, such as PBS containing 0.1% Tween 20™. Detection
reagent may then be added to the solid support. An appropriate detection reagent is any
compound that binds to the immobilized antibody-polypeptide complex and that can be
detected by any of a variety of means known to those in the art. Preferably, the
detection reagent contains a binding agent (such as, for example, Protein A, Protein G,
immunoglobulin, lectin or free antigen) conjugated to a reporter group. Preferred
reporter groups include enzymes (such as horseradish peroxidase), substrates, cofactors,
inhibitors, dyes, radionuclides, luminescent groups, fluorescent groups and biotin. The
conjugation of binding agent to reporter group may be achieved using standard methods
known to those of ordinary skill in the art. Common binding agents may also be
purchased conjugated to a variety of reporter groups from many commercial sources
(e.g., Zymed Laboratories, San Francisco, CA, and Pierce, Rockford, IL).

The detection reagent is then incubated with the immobilized
antibody-polypeptide complex for an amount of time sufficient to detect the bound antibody. An
appropriate amount of time may generally be determined from the manufacturer's
instructions or by assaying the level of binding that occurs over a period of time.
Unbound detection reagent is then removed and bound detection reagent is detected
using the reporter group. The method employed for detecting the reporter group
depends upon the nature of the reporter group. For radioactive groups, scintillation
counting or autoradiographic methods are generally appropriate. Spectroscopic
methods may be used to detect dyes, luminescent groups and fluorescent groups. Biotin
may be detected using avidin, coupled to a different reporter group (commonly a
radioactive or fluorescent group or an enzyme). Enzyme reporter groups may generally
be detected by the addition of substrate (generally for a specific period of time),
followed by spectroscopic or other analysis of the reaction products.

To determine the presence or absence of anti-M. tuberculosis antibodies
in the sample, the signal detected from the reporter group that remains bound to the
solid support is generally compared to a signal that corresponds to a predetermined
cut-off value. In one preferred embodiment, the cut-off value is the average mean signal
obtained when the immobilized antigen is incubated with samples from an uninfected
patient. In general, a sample generating a signal that is three standard deviations above
the predetermined cut-off value is considered positive for tuberculosis. In an alternate
preferred embodiment, the cut-off value is determined using a Receiver Operator Curve,
according to the method of Sackett et al., Clinical Epidemiology: A Basic Science for
Clinical Medicine, Little Brown and Co., 1985, pp. 106-107. Briefly, in this
embodiment, the cut-off value may be determined from a plot of pairs of true positive
rates (i.e., sensitivity) and false positive rates (100%-specificity) that correspond to each
possible cut-off value for the diagnostic test result. The cut-off value on the plot that is
the closest to the upper left-hand corner (i.e., the value that encloses the largest area) is
the most accurate cut-off value, and a sample generating a signal that is higher than the
cut-off value determined by this method may be considered positive. Alternatively, the
cut-off value may be shifted to the left along the plot, to minimize the false positive
rate, or to the right, to minimize the false negative rate. In general, a sample generating
a signal that is higher than the cut-off value determined by this method is considered
positive for tuberculosis.
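
The selection of the cut-off value closest to the upper left-hand corner of the plot can be sketched as follows. This is a simplified illustration rather than the cited method itself: it assumes that each observed signal is a candidate cut-off, that a sample is positive when its signal exceeds the cut-off, and that closeness is measured as Euclidean distance to the point of 0% false positives and 100% sensitivity. The function names and data are hypothetical.

```python
from math import hypot


def roc_points(infected, uninfected):
    """Return (cutoff, false_positive_rate, true_positive_rate) for each candidate cut-off."""
    points = []
    for cutoff in sorted(set(infected) | set(uninfected)):
        tpr = sum(s > cutoff for s in infected) / len(infected)
        fpr = sum(s > cutoff for s in uninfected) / len(uninfected)
        points.append((cutoff, fpr, tpr))
    return points


def best_cutoff(infected, uninfected):
    """Return the cut-off whose ROC point lies closest to the upper left-hand corner."""
    closest = min(roc_points(infected, uninfected), key=lambda p: hypot(p[1], 1.0 - p[2]))
    return closest[0]


# Hypothetical signals from known infected and uninfected sera.
infected_od = [0.9, 1.2, 0.8, 1.5, 0.6, 0.25]
uninfected_od = [0.1, 0.2, 0.15, 0.3, 0.5, 0.12]
cutoff = best_cutoff(infected_od, uninfected_od)
```

With these hypothetical data a cut-off of 0.5 gives no false positives while detecting five of the six infected sera, and is the point nearest the corner.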

In a related embodiment, the assay is performed in a rapid flow-through
or strip test format, wherein the antigen is immobilized on a membrane, such as
nitrocellulose. In the flow-through test, antibodies within the sample bind to the
immobilized polypeptide as the sample passes through the membrane. A detection
reagent (e.g., protein A-colloidal gold) then binds to the antibody-polypeptide complex
as the solution containing the detection reagent flows through the membrane. The
detection of bound detection reagent may then be performed as described above. In the
strip test format, one end of the membrane to which polypeptide is bound is immersed
in a solution containing the sample. The sample migrates along the membrane through
a region containing detection reagent and to the area of immobilized polypeptide.
Concentration of detection reagent at the polypeptide indicates the presence of
anti-M. tuberculosis antibodies in the sample. Typically, the concentration of detection
reagent at that site generates a pattern, such as a line, that can be read visually. The
absence of such a pattern indicates a negative result. In general, the amount of
polypeptide immobilized on the membrane is selected to generate a visually discernible
pattern when the biological sample contains a level of antibodies that would be
sufficient to generate a positive signal in an ELISA, as discussed above. Preferably, the
amount of polypeptide immobilized on the membrane ranges from about 25 ng to about
1 µg, and more preferably from about 50 ng to about 500 ng. Such tests can typically
be performed with a very small amount (e.g., one drop) of patient serum or blood.

Of course, numerous other assay protocols exist that are suitable for use
with the polypeptides of the present invention. The above descriptions are intended to
be exemplary only.

In yet another aspect, the present invention provides antibodies to the
inventive polypeptides. Antibodies may be prepared by any of a variety of techniques
known to those of ordinary skill in the art. See, e.g., Harlow and Lane, Antibodies: A
Laboratory Manual, Cold Spring Harbor Laboratory, 1988. In one such technique, an
immunogen comprising the antigenic polypeptide is initially injected into any of a wide
variety of mammals (e.g., mice, rats, rabbits, sheep and goats). In this step, the
polypeptides of this invention may serve as the immunogen without modification.
Alternatively, particularly for relatively short polypeptides, a superior immune response
may be elicited if the polypeptide is joined to a carrier protein, such as bovine serum
albumin or keyhole limpet hemocyanin. The immunogen is injected into the animal
host, preferably according to a predetermined schedule incorporating one or more
booster immunizations, and the animals are bled periodically. Polyclonal antibodies
specific for the polypeptide may then be purified from such antisera by, for example,
affinity chromatography using the polypeptide coupled to a suitable solid support.

Monoclonal antibodies specific for the antigenic polypeptide of interest
may be prepared, for example, using the technique of Kohler and Milstein, Eur. J.
Immunol. 6:511-519, 1976, and improvements thereto. Briefly, these methods involve
the preparation of immortal cell lines capable of producing antibodies having the
desired specificity (i.e., reactivity with the polypeptide of interest). Such cell lines may
be produced, for example, from spleen cells obtained from an animal immunized as
described above. The spleen cells are then immortalized by, for example, fusion with a
myeloma cell fusion partner, preferably one that is syngeneic with the immunized
animal. A variety of fusion techniques may be employed. For example, the spleen cells
and myeloma cells may be combined with a nonionic detergent for a few minutes and
then plated at low density on a selective medium that supports the growth of hybrid
cells, but not myeloma cells. A preferred selection technique uses HAT (hypoxanthine,
aminopterin, thymidine) selection. After a sufficient time, usually about 1 to 2 weeks,
colonies of hybrids are observed. Single colonies are selected and tested for binding
activity against the polypeptide. Hybridomas having high reactivity and specificity are
preferred.

Monoclonal antibodies may be isolated from the supernatants of growing
hybridoma colonies. In addition, various techniques may be employed to enhance the
yield, such as injection of the hybridoma cell line into the peritoneal cavity of a suitable
vertebrate host, such as a mouse. Monoclonal antibodies may then be harvested from
the ascites fluid or the blood. Contaminants may be removed from the antibodies by
conventional techniques, such as chromatography, gel filtration, precipitation, and
extraction. The polypeptides of this invention may be used in the purification process
in, for example, an affinity chromatography step.

Antibodies may be used in diagnostic tests to detect the presence of
M. tuberculosis antigens using assays similar to those detailed above and other
techniques well known to those of skill in the art, thereby providing a method for
detecting M. tuberculosis infection in a patient.

Diagnostic reagents of the present invention may also comprise DNA
sequences encoding one or more of the above polypeptides, or one or more portions
thereof. For example, primers comprising at least 10 contiguous nucleotides of the
subject DNA sequences may be used in polymerase chain reaction (PCR) based tests.
Similarly, probes comprising at least 15 contiguous nucleotides of the subject
DNA sequences may be used for hybridizing to specific sequences. Techniques for
both PCR based tests and hybridization tests are well known in the art. Primers or
probes may thus be used to detect M. tuberculosis infection in biological samples,
preferably sputum, blood, serum, saliva, cerebrospinal fluid or urine. DNA probes or
primers comprising oligonucleotide sequences described above may be used alone, in
combination with each other, or with previously identified sequences, such as the 38 kD
antigen discussed above.
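
The length requirements for primers and probes can be illustrated with a short check that an oligonucleotide is a contiguous stretch of a subject DNA sequence or of its complementary strand. The sketch is illustrative only; the function names and the example sequence are hypothetical, and the sequence shown is not one of the sequences of the disclosure.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def is_contiguous_oligo(oligo, subject, min_len):
    """True if oligo has at least min_len bases and occurs on either strand of subject."""
    oligo, subject = oligo.upper(), subject.upper()
    if len(oligo) < min_len:
        return False
    return oligo in subject or oligo in reverse_complement(subject)


# Hypothetical 30-base subject sequence used only for demonstration.
subject_dna = "ATGGCCGATCCGGTTGACGCTGTTATCAAC"
forward_primer = subject_dna[3:13]
reverse_primer = reverse_complement(subject_dna[-10:])
```

In this example both 10-base oligonucleotides satisfy the primer criterion (at least 10 bases) but not the probe criterion (at least 15 bases).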



The following Examples are offered by way of illustration and not by
way of limitation.

EXAMPLES

EXAMPLE 1

PURIFICATION AND CHARACTERIZATION OF POLYPEPTIDES
FROM M. TUBERCULOSIS CULTURE FILTRATE



This example illustrates the preparation of M. tuberculosis soluble
polypeptides from culture filtrate. Unless otherwise noted, all percentages in the
following example are weight per volume.

M. tuberculosis (either H37Ra, ATCC No. 25177, or H37Rv, ATCC
No. 25618) was cultured in sterile GAS media at 37°C for fourteen days. The media
was then vacuum filtered (leaving the bulk of the cells) through a 0.45 µ filter into a
sterile 2.5 L bottle. The media was then filtered through a 0.2 µ filter into a sterile 4 L
bottle. NaN3 was then added to the culture filtrate to a concentration of 0.04%. The
bottles were then placed in a 4°C cold room.

The culture filtrate was concentrated by placing the filtrate in a 12 L 
reservoir that had been autoclaved and feeding the filtrate into a 400 ml Amicon stir cell 
which had been rinsed with ethanol and contained a 10,000 kDa MWCO membrane. 
The pressure was maintained at 60 psi using nitrogen gas. This procedure reduced the
12 L volume to approximately 50 ml. 

The culture filtrate was then dialyzed into 0.1% ammonium bicarbonate
using a 8,000 kDa MWCO cellulose ester membrane, with two changes of ammonium
bicarbonate solution. Protein concentration was then determined by a commercially
available BCA assay (Pierce, Rockford, IL).

The dialyzed culture filtrate was then lyophilized, and the polypeptides
resuspended in distilled water. The polypeptides were then dialyzed against 0.01 mM
1,3 bis[tris(hydroxymethyl)-methylamino]propane, pH 7.5 (Bis-Tris propane buffer),
the initial conditions for anion exchange chromatography. Fractionation was performed
using gel profusion chromatography on a POROS 146 II Q/M anion exchange column
4.6 mm x 100 mm (Perseptive BioSystems, Framingham, MA) equilibrated in 0.01 mM
Bis-Tris propane buffer pH 7.5. Polypeptides were eluted with a linear 0-0.5 M NaCl
gradient in the above buffer system. The column eluent was monitored at a wavelength
of 220 nm.

The pools of polypeptides eluting from the ion exchange column were
dialyzed against distilled water and lyophilized. The resulting material was dissolved in
0.1% trifluoroacetic acid (TFA) pH 1.9 in water, and the polypeptides were purified on
a Delta-Pak C18 column (Waters, Milford, MA) 300 Angstrom pore size, 5 micron
particle size (3.9 x 150 mm). The polypeptides were eluted from the column with a
linear gradient from 0-60% dilution buffer (0.1% TFA in acetonitrile). The flow rate
was 0.75 ml/minute and the HPLC eluent was monitored at 214 nm. Fractions
containing the eluted polypeptides were collected to maximize the purity of the
individual samples. Approximately 200 purified polypeptides were obtained.

The purified polypeptides were then screened for the ability to induce
T-cell proliferation in PBMC preparations. The PBMCs from donors known to be PPD
skin test positive and whose T cells were shown to proliferate in response to PPD and
crude soluble proteins from MTB were cultured in medium comprising RPMI 1640
supplemented with 10% pooled human serum and 50 µg/ml gentamicin. Purified
polypeptides were added in duplicate at concentrations of 0.5 to 10 µg/mL. After six
days of culture in 96-well round-bottom plates in a volume of 200 µl, 50 µl of medium
was removed from each well for determination of IFN-γ levels, as described below.
The plates were then pulsed with 1 µCi/well of tritiated thymidine for a further 18
hours, harvested and tritium uptake determined using a gas scintillation counter.
Fractions that resulted in proliferation in both replicates three fold greater than the
proliferation observed in cells cultured in medium alone were considered positive.
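
The positivity criterion for the proliferation assay can be written as a one-line check. The sketch below assumes that tritium uptake is reported as counts per minute and that "three fold greater" means strictly more than three times the medium-alone value; the function name and the numbers in the example are hypothetical.

```python
def is_proliferation_positive(replicate_cpm, medium_alone_cpm, fold=3.0):
    """True only if every replicate exceeds fold times the medium-alone uptake."""
    return all(cpm > fold * medium_alone_cpm for cpm in replicate_cpm)


# Hypothetical duplicate wells compared with a medium-alone value of 1,000 cpm.
fraction_a = is_proliferation_positive([4200, 3900], 1000)
fraction_b = is_proliferation_positive([4200, 2800], 1000)
```

Here the first fraction would be scored positive, while the second would not because one of its two replicates falls below the three-fold level.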

IFN-γ was measured using an enzyme-linked immunosorbent assay
(ELISA). ELISA plates were coated with a mouse monoclonal antibody directed to
human IFN-γ (Chemicon) in PBS for four hours at room temperature. Wells were then
blocked with PBS containing 5% (W/V) non-fat dried milk for 1 hour at room
temperature. The plates were then washed six times in PBS/0.2% TWEEN-20 and
samples diluted 1:2 in culture medium in the ELISA plates were incubated overnight at
room temperature. The plates were again washed and a polyclonal rabbit anti-human
IFN-γ serum diluted 1:3000 in PBS/10% normal goat serum was added to each well.
The plates were then incubated for two hours at room temperature, washed and
horseradish peroxidase-coupled anti-rabbit IgG (Jackson Labs.) was added at a 1:2000
dilution in PBS/5% non-fat dried milk. After a further two hour incubation at room
temperature, the plates were washed and TMB substrate added. The reaction was
stopped after 20 min with 1 N sulfuric acid. Optical density was determined at 450 nm
using 570 nm as a reference wavelength. Fractions that resulted in both replicates
giving an OD two fold greater than the mean OD from cells cultured in medium alone,
plus 3 standard deviations, were considered positive.
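
The IFN-γ positivity criterion can be sketched as follows. The wording above admits more than one reading; this example assumes the threshold is twice the mean medium-alone OD plus three sample standard deviations of the medium-alone wells. The function names and OD values are hypothetical.

```python
from statistics import mean, stdev


def ifn_gamma_threshold(medium_alone_od):
    """Assumed threshold: 2 x mean OD of medium-alone wells + 3 x their standard deviation."""
    return 2 * mean(medium_alone_od) + 3 * stdev(medium_alone_od)


def is_ifn_gamma_positive(replicate_od, medium_alone_od):
    """True only if both (all) replicates exceed the assumed threshold."""
    threshold = ifn_gamma_threshold(medium_alone_od)
    return all(od > threshold for od in replicate_od)


# Hypothetical medium-alone wells: mean OD 0.10, standard deviation 0.02.
background = [0.10, 0.12, 0.08]
```

For these hypothetical background wells the threshold is approximately 0.26, so duplicate readings of 0.30 and 0.35 would be scored positive.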

For sequencing, the polypeptides were individually dried onto
Biobrene™ (Perkin Elmer/Applied BioSystems Division, Foster City, CA) treated glass
fiber filters. The filters with polypeptide were loaded onto a Perkin Elmer/Applied
BioSystems Division Procise 492 protein sequencer. The polypeptides were sequenced
from the amino terminus using traditional Edman chemistry. The amino acid
sequence was determined for each polypeptide by comparing the retention time of the
PTH amino acid derivative to the appropriate PTH derivative standards.

Using the procedure described above, antigens having the following
N-terminal sequences were isolated:

(a) Asp-Pro-Val-Asp-Ala-Val-Ile-Asn-Thr-Thr-Xaa-Asn-Tyr-Gly-
Gln-Val-Val-Ala-Ala-Leu (SEQ ID No. 54);

(b) Ala-Val-Glu-Ser-Gly-Met-Leu-Ala-Leu-Gly-Thr-Pro-Ala-Pro-
Ser (SEQ ID No. 55);

(c) Ala-Ala-Met-Lys-Pro-Arg-Thr-Gly-Asp-Gly-Pro-Leu-Glu-Ala-
Ala-Lys-Glu-Gly-Arg (SEQ ID No. 56);

(d) Tyr-Tyr-Trp-Cys-Pro-Gly-Gln-Pro-Phe-Asp-Pro-Ala-Trp-Gly-
Pro (SEQ ID No. 57);

(e) Asp-Ile-Gly-Ser-Glu-Ser-Thr-Glu-Asp-Gln-Gln-Xaa-Ala-Val
(SEQ ID No. 58);

(f) Ala-Glu-Glu-Ser-Ile-Ser-Thr-Xaa-Glu-Xaa-Ile-Val-Pro (SEQ ID
No. 59);

(g) Asp-Pro-Glu-Pro-Ala-Pro-Pro-Val-Pro-Thr-Ala-Ala-Ala-Ala-
Pro-Pro-Ala (SEQ ID No. 60); and

(h) Ala-Pro-Lys-Thr-Tyr-Xaa-Glu-Glu-Leu-Lys-Gly-Thr-Asp-Thr-
Gly (SEQ ID No. 61);
wherein Xaa may be any amino acid.

An additional antigen was isolated employing a microbore HPLC
purification step in addition to the procedure described above. Specifically, 20 µl of a
fraction comprising a mixture of antigens from the chromatographic purification step
previously described, was purified on an Aquapore C18 column (Perkin Elmer/Applied
Biosystems Division, Foster City, CA) with a 7 micron pore size, column size 1 mm x
100 mm, in a Perkin Elmer/Applied Biosystems Division Model 172 HPLC. Fractions
were eluted from the column with a linear gradient of 1%/minute of acetonitrile
(containing 0.05% TFA) in water (0.05% TFA) at a flow rate of 80 µl/minute. The
eluent was monitored at 250 nm. The original fraction was separated into 4 major peaks
plus other smaller components and a polypeptide was obtained which was shown to
have a molecular weight of 12.054 Kd (by mass spectrometry) and the following N-
terminal sequence:

(i) Asp-Pro-Ala-Ser-Ala-Pro-Asp-Val-Pro-Thr-Ala-Ala-Gln-Gln-
Thr-Ser-Leu-Leu-Asn-Asn-Leu-Ala-Asp-Pro-Asp-Val-Ser-Phe-
Ala-Asp (SEQ ID No. 62).
This polypeptide was shown to induce proliferation and IFN-γ production in PBMC
preparations using the assays described above.

Additional soluble antigens were isolated from M. tuberculosis culture
filtrate as follows. M. tuberculosis culture filtrate was prepared as described above.
Following dialysis against Bis-Tris propane buffer, at pH 5.5, fractionation was
performed using anion exchange chromatography on a Poros QE column 4.6 x 100 mm
(Perseptive Biosystems) equilibrated in Bis-Tris propane buffer pH 5.5. Polypeptides
were eluted with a linear 0-1.5 M NaCl gradient in the above buffer system at a flow
rate of 10 ml/min. The column eluent was monitored at a wavelength of 214 nm.

The fractions eluting from the ion exchange column were pooled and
subjected to reverse phase chromatography using a Poros R2 column 4.6 x 100 mm
(Perseptive Biosystems). Polypeptides were eluted from the column with a linear
gradient from 0-100% acetonitrile (0.1% TFA) at a flow rate of 5 ml/min. The eluent
was monitored at 214 nm.

Fractions containing the eluted polypeptides were lyophilized and
resuspended in 80 µl of aqueous 0.1% TFA and further subjected to reverse phase
chromatography on a Vydac C4 column 4.6 x 150 mm (Western Analytical, Temecula,
CA) with a linear gradient of 0-100% acetonitrile (0.1% TFA) at a flow rate of 2
ml/min. Eluent was monitored at 214 nm.

The fraction with biological activity was separated into one major peak
plus other smaller components. Western blot of this peak onto PVDF membrane
revealed three major bands of molecular weights 14 Kd, 20 Kd and 26 Kd. These
polypeptides were determined to have the following N-terminal sequences, respectively:

(j) Xaa-Asp-Ser-Glu-Lys-Ser-Ala-Thr-Ile-Lys-Val-Thr-Asp-Ala-
Ser (SEQ ID No. 129);

(k) Ala-Gly-Asp-Thr-Xaa-Ile-Tyr-Ile-Val-Gly-Asn-Leu-Thr-Ala-
Asp (SEQ ID No. 130); and

(l) Ala-Pro-Glu-Ser-Gly-Ala-Gly-Leu-Gly-Gly-Thr-Val-Gln-Ala-
Gly (SEQ ID No. 131), wherein Xaa may be any amino acid.
Using the assays described above, these polypeptides were shown to induce
proliferation and IFN-γ production in PBMC preparations. Figs. 1A and B show the
results of such assays using PBMC preparations from a first and a second donor,
respectively.

DNA sequences that encode the antigens designated as (a), (c), (d) and
(g) above were obtained by screening an M. tuberculosis genomic library using ³²P end
labeled degenerate oligonucleotides corresponding to the N-terminal sequence and
containing M. tuberculosis codon bias. The screen performed using a probe
corresponding to antigen (a) above identified a clone having the sequence provided in
SEQ ID No. 96. The polypeptide encoded by SEQ ID No. 96 is provided in SEQ ID
No. 97. The screen performed using a probe corresponding to antigen (g) above
identified a clone having the sequence provided in SEQ ID No. 52. The polypeptide
encoded by SEQ ID No. 52 is provided in SEQ ID No. 53. The screen performed using
a probe corresponding to antigen (d) above identified a clone having the sequence
provided in SEQ ID No. 24, and the screen performed with a probe corresponding to
antigen (c) identified a clone having the sequence provided in SEQ ID No. 25.
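
To illustrate why degenerate probes and codon bias matter here, the following Python sketch counts how many distinct coding sequences could encode a short N-terminal window under the standard genetic code. It is not the probe design actually used; the function name is hypothetical, and the seven-residue window from antigen (d) is chosen only as an example.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code.
CODONS_PER_RESIDUE = {
    "Ala": 4, "Arg": 6, "Asn": 2, "Asp": 2, "Cys": 2, "Gln": 2, "Glu": 2,
    "Gly": 4, "His": 2, "Ile": 3, "Leu": 6, "Lys": 2, "Met": 1, "Phe": 2,
    "Pro": 4, "Ser": 6, "Thr": 4, "Trp": 1, "Tyr": 2, "Val": 4,
}

def probe_degeneracy(peptide):
    """Count the coding sequences for a hyphen-separated three-letter peptide.

    Raises KeyError for Xaa or any other residue without a defined codon count.
    """
    return prod(CODONS_PER_RESIDUE[residue] for residue in peptide.split("-"))

# First seven residues of antigen (d); the choice of window is an assumption.
window = "Tyr-Tyr-Trp-Cys-Pro-Gly-Gln"
print(probe_degeneracy(window))
```

Restricting each position to the codons an organism prefers lowers this count, which is the practical effect of building codon bias into the oligonucleotide pool.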

The above amino acid sequences were compared to known amino acid
sequences in the gene bank using the DNA STAR system. The database searched
contains some 173,000 proteins and is a combination of the Swiss and PIR databases along
with translated protein sequences (Version 87). No significant homologies to the amino
acid sequences for antigens (a)-(h) and (l) were detected.

The amino acid sequence for antigen (i) was found to be homologous to
a sequence from M. leprae. The full length M. leprae sequence was amplified from
genomic DNA using the sequence obtained from GENBANK. This sequence was then
used to screen an M. tuberculosis library and a full length copy of the M. tuberculosis
homologue was obtained (SEQ ID No. 94).

The amino acid sequence for antigen (j) was found to be homologous to
a known M. tuberculosis protein translated from a DNA sequence. To the best of the
inventors' knowledge, this protein has not been previously shown to possess T-cell
stimulatory activity. The amino acid sequence for antigen (k) was found to be related to
a sequence from M. leprae.

In the proliferation and IFN-γ assays described above, using three PPD
positive donors, the results for representative antigens provided above are presented in
Table 1:

TABLE 1

Results of PBMC Proliferation and IFN-γ Assays

Sequence    Proliferation    IFN-γ
(a)         +                [not legible]
(c)         +++              +++
(d)         ++               ++
(g)         +++              +++
(h)         +++              +++

In Table 1, responses that gave a stimulation index (SI) of between 2 and
4 (compared to cells cultured in medium alone) were scored as +, an SI of 4-8 or 2-4 at a
concentration of 1 ng or less was scored as ++ and an SI of greater than 8 was scored as
+++. The antigen of sequence (i) was found to have a high SI (+++) for one donor and
lower SI (++ and +) for the two other donors in both proliferation and IFN-γ assays.
These results indicate that these antigens are capable of inducing proliferation and/or
interferon-γ production.
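
The scoring scheme used for Table 1 can be summarized as a small function. This Python sketch is a hedged illustration: the function name and the `low_concentration` flag are hypothetical, the handling of the exact boundary values 4 and 8 is an assumption, and the "-" score for an SI below 2 is not stated in the text.

```python
def score_si(si, low_concentration=False):
    """Map a stimulation index (SI) to the +/++/+++ scale described for Table 1.

    low_concentration marks a response obtained at the low antigen
    concentration that the text treats as upgrading an SI of 2-4 to ++.
    """
    if si > 8:
        return "+++"
    if si >= 4 or (si >= 2 and low_concentration):
        return "++"
    if si >= 2:
        return "+"
    return "-"  # assumption: responses with SI below 2 are scored negative

for si, low in [(3, False), (3, True), (6, False), (9, False)]:
    print(si, low, score_si(si, low))
```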

EXAMPLE 2

USE OF PATIENT SERA TO ISOLATE M. TUBERCULOSIS ANTIGENS

This example illustrates the isolation of antigens from M. tuberculosis
lysate by screening with serum from M. tuberculosis-infected individuals.

Desiccated M. tuberculosis H37Ra (Difco Laboratories) was added to a
2% NP40 solution, and alternately homogenized and sonicated three times. The
resulting suspension was centrifuged at 13,000 rpm in microfuge tubes and the
supernatant put through a 0.2 micron syringe filter. The filtrate was bound to Macro
Prep DEAE beads (BioRad, Hercules, CA). The beads were extensively washed with
20 mM Tris pH 7.5 and bound proteins eluted with 1 M NaCl. The NaCl eluate was
dialyzed overnight against 10 mM Tris, pH 7.5. Dialyzed solution was treated with
DNase and RNase at 0.05 mg/ml for 30 min. at room temperature and then with α-D-
mannosidase, 0.5 U/mg at pH 4.5 for 3-4 hours at room temperature. After returning to
pH 7.5, the material was fractionated via FPLC over a Bio Scale-Q-20 column
(BioRad). Fractions were combined into nine pools, concentrated in a Centriprep 10
(Amicon, Beverly, MA) and screened by Western blot for serological activity using a
serum pool from M. tuberculosis-infected patients which was not immunoreactive with
other antigens of the present invention.

The most reactive fraction was run in SDS-PAGE and transferred to
PVDF. A band at approximately 85 Kd was cut out yielding the sequence:

(m) Xaa-Tyr-Ile-Ala-Tyr-Xaa-Thr-Thr-Ala-Gly-Ile-Val-Pro-Gly-Lys-
Ile-Asn-Val-His-Leu-Val (SEQ ID No. 132), wherein Xaa may
be any amino acid.
Comparison of this sequence with those in the gene bank as described
above, revealed no significant homologies to known sequences.


EXAMPLE 3

Preparation of DNA Sequences Encoding M. tuberculosis Antigens

This example illustrates the preparation of DNA sequences encoding
M. tuberculosis antigens by screening an M. tuberculosis expression library with sera
obtained from patients infected with M. tuberculosis, or with anti-sera raised against
M. tuberculosis antigens.

A. Preparation of M. tuberculosis Soluble Antigens using Rabbit Anti-Sera

Genomic DNA was isolated from the M. tuberculosis strain H37Ra. The
DNA was randomly sheared and used to construct an expression library using the
Lambda ZAP expression system (Stratagene, La Jolla, CA). Rabbit anti-sera were
generated against secretory proteins of the M. tuberculosis strains H37Ra, H37Rv and
Erdman by immunizing a rabbit with concentrated supernatant of the M. tuberculosis
cultures. Specifically, the rabbit was first immunized subcutaneously with 200 µg of
protein antigen in a total volume of 2 ml containing 100 µg muramyl dipeptide
(Calbiochem, La Jolla, CA) and 1 ml of incomplete Freund's adjuvant. Four weeks later
the rabbit was boosted subcutaneously with 100 µg antigen in incomplete Freund's
adjuvant. Finally, the rabbit was immunized intravenously four weeks later with 50 µg
protein antigen. The anti-sera were used to screen the expression library as described in
Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratories, Cold Spring Harbor, NY, 1989. Bacteriophage plaques expressing
immunoreactive antigens were purified. Phagemid from the plaques was rescued and
the nucleotide sequences of the M. tuberculosis clones deduced.

Thirty two clones were purified. Of these, 25 represent sequences that
have not been previously identified in M. tuberculosis. Proteins were induced by IPTG
and purified by gel elution, as described in Skeiky et al., J. Exp. Med. 181:1527-1537,
1995. Representative partial sequences of DNA molecules identified in this screen are
provided in SEQ ID Nos. 1-25. The corresponding predicted amino acid sequences are
shown in SEQ ID Nos. 64-88.

On comparison of these sequences with known sequences in the gene
bank using the databases described above, it was found that the clones referred to
hereinafter as TbRA2A, TbRA16, TbRA18, and TbRA29 (SEQ ID Nos. 77, 69, 71, 76)
show some homology to sequences previously identified in Mycobacterium leprae but
not in M. tuberculosis. TbRA11, TbRA26, TbRA28 and TbDPEP (SEQ ID Nos. 66,
74, 75, 53) have been previously identified in M. tuberculosis. No significant
homologies were found to TbRA1, TbRA3, TbRA4, TbRA9, TbRA10, TbRA13,
TbRA17, TbRA19, TbRA29, TbRA32, TbRA36 and the overlapping clones TbRA35
and TbRA12 (SEQ ID Nos. 64, 78, 82, 83, 65, 68, 76, 72, 76, 79, 81, 80, 67,
respectively). The clone TbRa24 is overlapping with clone TbRa29.

B. Use of Patient Sera to Identify DNA Sequences Encoding
M. tuberculosis Antigens

The genomic DNA library described above, and an additional H37Rv
library, were screened using pools of sera obtained from patients with active
tuberculosis. To prepare the H37Rv library, M. tuberculosis strain H37Rv genomic
DNA was isolated, subjected to partial Sau3A digestion and used to construct an
expression library using the Lambda Zap expression system (Stratagene, La Jolla, CA).
Three different pools of sera, each containing sera obtained from three individuals with
active pulmonary or pleural disease, were used in the expression screening. The pools
were designated TbL, TbM and TbH, referring to relative reactivity with H37Ra lysate
(i.e., TbL = low reactivity, TbM = medium reactivity and TbH = high reactivity) in both
ELISA and immunoblot format. A fourth pool of sera from seven patients with active
pulmonary tuberculosis was also employed. All of the sera lacked increased reactivity
with the recombinant 38 kD M. tuberculosis H37Ra phosphate-binding protein.

All pools were pre-adsorbed with E. coli lysate and used to screen the
H37Ra and H37Rv expression libraries, as described in Sambrook et al., Molecular
Cloning: A Laboratory Manual, Cold Spring Harbor Laboratories, Cold Spring Harbor,
NY, 1989. Bacteriophage plaques expressing immunoreactive antigens were purified.
Phagemid from the plaques was rescued and the nucleotide sequences of the
M. tuberculosis clones deduced.

Thirty two clones were purified. Of these, 31 represented sequences that
had not been previously identified in human M. tuberculosis. Representative sequences
of the DNA molecules identified are provided in SEQ ID NOS.: 26-51 and 100. Of
these, TbH-8 and TbH-8-2 (SEQ. ID NO. 100) are non-contiguous DNA sequences
from the same clone, and TbH-4 (SEQ. ID NO. 43) and TbH-4-FWD (SEQ. ID NO. 44)
are non-contiguous sequences from the same clone. Amino acid sequences for the
antigens hereinafter identified as Tb38-1, TbH-4, TbH-8, TbH-9, and TbH-12 are
shown in SEQ ID NOS.: 89-93. Comparison of these sequences with known sequences
in the gene bank using the databases identified above revealed no significant
homologies to TbH-4, TbH-8, TbH-9 and TbM-3, although weak homologies were
found to TbH-9. TbH-12 was found to be homologous to a 34 kD antigenic protein
previously identified in M. paratuberculosis (Acc. No. S28515). Tb38-1 was found to
be located 34 base pairs upstream of the open reading frame for the antigen ESAT-6
previously identified in M. bovis (Acc. No. U34848) and in M. tuberculosis (Sorensen
et al., Infect. Immun. 63:1710-1717, 1995).

Probes derived from Tb38-1 and TbH-9, both isolated from an H37Ra
library, were used to identify clones in an H37Rv library. Tb38-1 hybridized to
Tb38-1F2, Tb38-1F3, Tb38-1F5 and Tb38-1F6 (SEQ. ID NOS. 107, 108, 111, 113, and
114). (SEQ ID NOS. 107 and 108 are non-contiguous sequences from clone Tb38-
1F2.) Two open reading frames were deduced in Tb38-1F2; one corresponds to Tb37FL
(SEQ. ID. NO. 109), the second, a partial sequence, may be the homologue of Tb38-1
and is called Tb38-IN (SEQ. ID NO. 110). The deduced amino acid sequence of Tb38-
1F3 is presented in SEQ. ID. NO. 112. A TbH-9 probe identified three clones in the
H37Rv library: TbH-9-FL (SEQ. ID NO. 101), which may be the homologue of TbH-9
(H37Ra), TbH-9-1 (SEQ. ID NO. 103), and TbH-9-4 (SEQ. ID NO. 105), all of which
are highly related sequences to TbH-9. The deduced amino acid sequences for these
three clones are presented in SEQ ID NOS. 102, 104 and 106.



EXAMPLE 4

Purification and Characterization of a Polypeptide from Tuberculin Purified
Protein Derivative

An M. tuberculosis polypeptide was isolated from tuberculin purified
protein derivative (PPD) as follows.

PPD was prepared as published with some modification (Seibert, F. et
al., Tuberculin purified protein derivative. Preparation and analyses of a large quantity
for standard. The American Review of Tuberculosis 44:9-25, 1941). M.
tuberculosis Rv strain was grown for 6 weeks in synthetic medium in roller bottles at
37°C. Bottles containing the bacterial growth were then heated to 100°C in water vapor
for 3 hours. Cultures were sterile filtered using a 0.22 µ filter and the liquid phase was
concentrated 20 times using a 3 kD cut-off membrane. Proteins were precipitated once
with 50% ammonium sulfate solution and eight times with 25% ammonium sulfate
solution. The resulting proteins (PPD) were fractionated by reverse phase liquid
chromatography (RP-HPLC) using a C18 column (7.8 x 300 mm; Waters, Milford,
MA) in a Biocad HPLC system (Perseptive Biosystems, Framingham, MA). Fractions
were eluted from the column with a linear gradient from 0-100% buffer (0.1% TFA in
acetonitrile). The flow rate was 10 ml/minute and eluent was monitored at 214 nm and
280 nm.

Six fractions were collected, dried, suspended in PBS and tested
individually in M. tuberculosis-infected guinea pigs for induction of delayed type
hypersensitivity (DTH) reaction. One fraction was found to induce a strong DTH
reaction and was subsequently fractionated further by RP-HPLC on a microbore Vydac
C18 column (Cat. No. 218TP5115) in a Perkin Elmer/Applied Biosystems Division
Model 172 HPLC. Fractions were eluted with a linear gradient from 5-100% buffer
(0.05% TFA in acetonitrile) with a flow rate of 80 µl/minute. Eluent was monitored at
215 nm. Eight fractions were collected and tested for induction of DTH in M.
tuberculosis-infected guinea pigs. One fraction was found to induce strong DTH of
about 16 mm induration. The other fractions did not induce detectable DTH. The
positive fraction was submitted to SDS-PAGE gel electrophoresis and found to contain
a single protein band of approximately 12 kD molecular weight.

This polypeptide, hereinafter referred to as DPPD, was sequenced from
the amino terminal using a Perkin Elmer/Applied Biosystems Division Procise 492
protein sequencer as described above and found to have the N-terminal sequence shown
in SEQ ID No.: 124. Comparison of this sequence with known sequences in the gene
bank as described above revealed no known homologies. Four cyanogen bromide
fragments of DPPD were isolated and found to have the sequences shown in SEQ ID
Nos.: 125-128.



EXAMPLE 5

Synthesis of Synthetic Polypeptides

Polypeptides may be synthesized on a Millipore 9050 peptide
synthesizer using FMOC chemistry with HBTU (O-Benzotriazole-N,N,N',N'-
tetramethyluronium hexafluorophosphate) activation. A Gly-Cys-Gly sequence may be
attached to the amino terminus of the peptide to provide a method of conjugation or
labeling of the peptide. Cleavage of the peptides from the solid support may be carried
out using the following cleavage mixture: trifluoroacetic
acid:ethanedithiol:thioanisole:water:phenol (40:1:2:2:3). After cleaving for 2 hours, the
peptides may be precipitated in cold methyl-t-butyl-ether. The peptide pellets may then
be dissolved in water containing 0.1% trifluoroacetic acid (TFA) and lyophilized prior
to purification by C18 reverse phase HPLC. A gradient of 0-60% acetonitrile
(containing 0.1% TFA) in water (containing 0.1% TFA) may be used to elute the
peptides. Following lyophilization of the pure fractions, the peptides may be
characterized using electrospray mass spectrometry and by amino acid analysis.

This procedure was used to synthesize a TbM-1 peptide that contains one
and a half repeats of a TbM-1 sequence. The TbM-1 peptide has the sequence
GCGDRSGGNLDQIRLRRDRSGGNL (SEQ ID No. 63).
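
The "one and a half repeats" description can be checked against the printed sequence. The Python sketch below is illustrative: the helper name is hypothetical, and the 14-residue repeat length is an assumption inferred from the sequence itself rather than a value stated in the text.

```python
LINKER = "GCG"  # Gly-Cys-Gly in one-letter code
PEPTIDE = "GCGDRSGGNLDQIRLRRDRSGGNL"  # SEQ ID No. 63 as printed above

def split_repeats(peptide, linker, unit_len):
    """Strip the N-terminal linker, then split into one full unit and the rest."""
    if not peptide.startswith(linker):
        raise ValueError("peptide does not start with the expected linker")
    body = peptide[len(linker):]
    return body[:unit_len], body[unit_len:]

# unit_len=14 is an assumed repeat length, chosen because it makes the
# remainder equal to the first half of the unit.
unit, remainder = split_repeats(PEPTIDE, LINKER, 14)
print(unit, remainder, unit.startswith(remainder))
```

Under that assumption the peptide is the linker, one full repeat unit, and the first half of a second unit.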

EXAMPLE 6

Use of Representative Antigens for Serodiagnosis of Tuberculosis

This Example illustrates the diagnostic properties of several
representative antigens. Figures 1 and 2 present the reactivity of representative antigens
with sera from M. tuberculosis-infected and uninfected individuals, as compared to the
reactivity of bacterial lysate and the 38 kD antigen.

Assays were performed in 96-well plates that were coated with 200 ng
antigen diluted to 50 µL in carbonate coating buffer, pH 9.6. The wells were coated
overnight at 4°C (or 2 hours at 37°C). The plate contents were then removed and the
wells were blocked for 2 hours with 200 µL of PBS/1% BSA. After the blocking step,
the wells were washed five times with PBS/0.1% Tween 20™. 50 µL sera, diluted
1:100 in PBS/0.1% Tween 20™/0.1% BSA, was then added to each well and incubated
for 30 minutes at room temperature. The plates were then washed again five times with
PBS/0.1% Tween 20™.

The enzyme conjugate (horseradish peroxidase - Protein A, Zymed, San
Francisco, CA) was then diluted 1:10,000 in PBS/0.1% Tween 20™/0.1% BSA, and
50 µL of the diluted conjugate was added to each well and incubated for 30 minutes at
room temperature. Following incubation, the wells were washed five times with
PBS/0.1% Tween 20™. 100 µL of tetramethylbenzidine peroxidase (TMB) substrate
(Kirkegaard and Perry Laboratories, Gaithersburg, MD) was added, undiluted, and
incubated for about 15 minutes. The reaction was stopped with the addition of 100 µL
of 1 N H2SO4 to each well, and the plates were read at 450 nm.

Figure 2 shows the ELISA reactivity of two recombinant antigens
isolated using method A in Example 3 (TbRa3 and TbRa9) with sera from
M. tuberculosis positive and negative patients. The reactivity of these antigens is
compared to that of bacterial lysate isolated from M. tuberculosis strain H37Ra (Difco,
Detroit, MI). In both cases, the recombinant antigens differentiated positive from
negative sera. Based on cut-off values obtained from receiver-operator curves, TbRa3
detected 56 out of 87 positive sera, and TbRa9 detected 111 out of 165 positive sera.

Figure 3 illustrates the ELISA reactivity of representative antigens
isolated using method B of Example 3. The reactivity of the recombinant antigens
TbH4, TbH12, Tb38-1 and the peptide TbM-1 (as described in Example 4) is compared
to that of the 38 kD antigen described by Andersen and Hansen, Infect. Immun.
57:2481-2488, 1989. Again, all of the polypeptides tested differentiated positive from
negative sera. Based on cut-off values obtained from receiver-operator curves, TbH4
detected 67 out of 126 positive sera, TbH12 detected 50 out of 125 positive sera, 38-1
detected 61 out of 101 positive sera and the TbM-1 peptide detected 25 out of 30
positive sera.

The reactivity of four antigens (TbRa3, TbRa9, TbH4 and TbH12) with
sera from a group of M. tuberculosis infected patients with differing reactivity in the
acid fast stain of sputum (Smithwick and David, Tubercle 52:226, 1971) was also
examined, and compared to the reactivity of M. tuberculosis lysate and the 38 kD
antigen. The results are presented in Table 2, below:



TABLE 2

Reactivity of Antigens with Sera from M. tuberculosis Patients

                Acid Fast                      ELISA Values
Patient         Sputum        Lysate   38kD    TbRa9   TbH12   TbH4    TbRa3
[illegible]     [illegible]   1.853    0.634   0.998   1.022   1.030   1.314
[illegible]     ++++          2.657    2.322   0.608   0.837   1.857   2.335
[illegible]-6   +++           2.703    0.527   0.492   0.281   0.501   2.002
Tb01B93I-10     +++           1.665    1.301   0.685   0.216   0.448   0.458
Tb01B93I-11     +++           2.817    0.697   0.509   0.301   0.173   2.608
Tb01B93I-15     +++           1.28     0.283   0.808   0.218   1.537   0.811
Tb01B93I-16     +++           2.908    >3      0.899   0.441   0.593   1.080
Tb01B93I-25     +++           0.395    0.131   0.335   0.211   0.107   0.948
Tb01B93I-87     +++           2.653    2.432   2.282   0.977   1.221   0.857
Tb01B93I-89     +++           1.912    2.370   2.436   0.876   0.520   0.952
Tb01B94I-108    +++           1.639    0.341   0.797   0.368   0.654   0.798
Tb01B94I-201    +++           1.721    0.419   0.661   0.137   0.064   0.692
Tb01B93I-88     ++            1.939    1.269   2.519   1.381   0.214   0.530
Tb01B93I-92     ++            2.355    2.329   2.78    0.685   0.997   2.527
Tb01B94I-109    ++            0.993    0.620   0.574   0.441   0.5     2.558

Based on cut-off values obtained from receiver-operator curves, TbRa3
detected 23 out of 27 positive sera, TbRa9 detected 22 out of 27, TbH4 detected 18 out
of 27 and TbH12 detected 15 out of 27. If used in combination, these four antigens
would have a theoretical sensitivity of 27 out of 27, indicating that these antigens
should complement each other in the serological detection of M. tuberculosis infection.
In addition, several of the recombinant antigens detected positive sera that were not
detected using the 38 kD antigen, indicating that these antigens may be complementary
to the 38 kD antigen.
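
The detection counts above translate directly into per-antigen sensitivities, and they also bound what a combination of antigens could achieve. The following Python sketch is illustrative only; the function names are hypothetical, and the bounds are computed from the counts alone because the per-serum cut-offs are not given in the text.

```python
# Detection counts quoted above for the 27 acid-fast positive sera.
DETECTED = {"TbRa3": 23, "TbRa9": 22, "TbH4": 18, "TbH12": 15}
TOTAL_SERA = 27

def sensitivity(hits, total):
    """Fraction of positive sera detected by a single antigen."""
    return hits / total

def combined_bounds(counts, total):
    """Lower and upper bounds on the sensitivity of the antigens combined.

    Lower bound: the best single antigen (the others add nothing new).
    Upper bound: every serum detected by at least one antigen, capped at total.
    """
    lower = max(counts) / total
    upper = min(total, sum(counts)) / total
    return lower, upper

for antigen, hits in DETECTED.items():
    print(f"{antigen}: {sensitivity(hits, TOTAL_SERA):.1%}")

low, high = combined_bounds(DETECTED.values(), TOTAL_SERA)
print(f"combined: between {low:.1%} and {high:.1%}")
```

The stated theoretical sensitivity of 27 out of 27 corresponds to the upper bound; confirming it requires the per-serum positive or negative call for each antigen, which the counts alone do not provide.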

The reactivity of the recombinant antigen TbRa11 with sera from
M. tuberculosis patients shown to be negative for the 38 kD antigen, as well as with sera
from PPD positive and normal donors, was determined by ELISA as described above.
The results are shown in Figure 4 which indicates that TbRa11, while being negative
with sera from PPD positive and normal donors, detected sera that were negative with
the 38 kD antigen. Of the thirteen 38 kD negative sera tested, nine were positive with
TbRa11, indicating that this antigen may be reacting with a sub-group of 38 kD antigen
negative sera. In contrast, in a group of 38 kD positive sera where TbRa11 was
reactive, the mean OD 450 for TbRa11 was lower than that for the 38 kD antigen. The
data indicate an inverse relationship between the presence of TbRa11 activity and 38 kD
positivity.

The antigen TbRa2A was tested in an indirect ELISA using initially
50 µl of serum at 1:100 dilution for 30 minutes at room temperature followed by
washing in PBS Tween and incubating for 30 minutes with biotinylated Protein A
(Zymed, San Francisco, CA) at a 1:10,000 dilution. Following washing, 50 µl of
streptavidin-horseradish peroxidase (Zymed) at 1:10,000 dilution was added and the
mixture incubated for 30 minutes. After washing, the assay was developed with TMB
substrate as described above. The reactivity of TbRa2A with sera from M. tuberculosis
patients and normal donors is shown in Table 3. The mean value for reactivity of
TbRa2A with sera from M. tuberculosis patients was 0.444 with a standard deviation of
0.309. The mean for reactivity with sera from normal donors was 0.109 with a standard
deviation of 0.029. Testing of 38 kD negative sera (Figure 5) also indicated that the
TbRa2A antigen was capable of detecting sera in this category.

TABLE 3

Reactivity of TbRa2A with Sera from M. tuberculosis Patients and from
Normal Donors

Serum ID    Status    OD 450
Tb85        TB        0.680
Tb86        TB        0.450
Tb87        TB        0.263
Tb88        TB        0.275
Tb89        TB        0.403
Tb91        TB        0.393
Tb92        TB        0.401
Tb93        TB        0.232
Tb94        TB        0.333
Tb95        TB        0.435
Tb96        TB        0.284
Tb97        TB        0.320
Tb99        TB        0.328
Tb100       TB        0.817
Tb101       TB        0.607
Tb102       TB        0.191
Tb103       TB        0.228
Tb107       TB        0.324
Tb109       TB        1.572
Tb112       TB        0.338
DL4-0176    Normal    0.036
AT4-0043    Normal    0.126
AT4-0044    Normal    0.130
AT4-0052    Normal    0.135
AT4-0053    Normal    0.133
AT4-0062    Normal    0.128
AT4-0070    Normal    0.088
AT4-0091    Normal    0.108
AT4-0100    Normal    0.106
AT4-0105    Normal    0.108
AT4-0109    Normal    0.105

The reactivity of the recombinant antigen (g) (SEQ ID No. 60) with sera
from M. tuberculosis patients and normal donors was determined by ELISA as
described above. Figure 6 shows the results of the titration of antigen (g) with four
M. tuberculosis positive sera that were all reactive with the 38 kD antigen and with four
donor sera. All four positive sera were reactive with antigen (g).

From the foregoing, it will be appreciated that, although specific
embodiments of the invention have been described herein for the purpose of illustration,
various modifications may be made without deviating from the spirit and scope of the
invention.

SEQUENCE LISTING



(1) GENERAL INFORMATION: 

(i) APPLICANTS: Corixa Corporation 

(ii) TITLE OF INVENTION: COMPOUNDS AND METHODS FOR DIAGNOSIS OF

TUBERCULOSIS

(iii) NUMBER OF SEQUENCES: 132

(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: SEED and BERRY LLP

(B) STREET: 6300 Columbia Center, 701 Fifth Avenue

(C) CITY: Seattle

(D) STATE: Washington

(E) COUNTRY: USA

(F) ZIP: 98104-7092

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk

(B) COMPUTER: IBM PC compatible

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER:

(B) FILING DATE: 27-AUG-1996

(C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Maki, David J.

(B) REGISTRATION NUMBER: 31,392

(C) REFERENCE/DOCKET NUMBER: 210121.417PC

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (206) 622-4900 

(B) TELEFAX: (206) 682-6031 



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 766 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

CGAGGCACCG GTAGTTTGAA CCAAACGCAC AATCGACGGG CAAACGAACG GAAGAACACA 60

ACCATGAAGA TGGTGAAATC GATCGCCGCA GGTCTGACCG CCGCGGCTGC AATCGGCGCC 120

GCTGCGGCCG GTGTGACTTC GATCATGGCT GGCGGCCCGG TCGTATACCA GATGCAGCCG 180

GTCGTCTTCG GCGCGCCACT GCCGTTGGAC CCGGCATCCG CCCCTGACGT CCCGACCGCC 240

GCCCAGTTGA CCAGCCTGCT CAACAGCCTC GCCGATCCCA ACGTGTCGTT TGCGAACAAG 300

GGCAGTCTGG TCGAGGGCGG CATCGGGGGC ACCGAGGCGC GCATCGCCGA CCACAAGCTG 360

AAGAAGGCCG CCGAGCACGG GGATCTGCCG CTGTCGTTCA GCGTGACGAA CATCCAGCCG 420

GCGGCCGCCG GTTCGGCCAC CGCCGACGTT TCCGTCTCGG GTCCGAAGCT CTCGTCGCCG 480

GTCACGCAGA ACGTCACGTT CGTGAATCAA GGCGGCTGGA TGCTGTCACG CGCATCGGCG 540

ATGGAGTTGC TGCAGGCCGC AGGGNAACTG ATTGGCGGGC CGGNTTCAGC CCGCTGTTCA 600

GCTACGCCGC CCGCCTGGTG ACGCGTCCAT GTCGAACACT CGCGCGTGTA GCACGGTGCG 660

GTNTGCGCAG GGNCGCACGC ACCGCCCGGT GCAAGCCGTC CTCGAGATAG GTGGTGNCTC 720

GNCACCAGNG ANCACCCCCN NNTCGNCNNT TCTCGNTGNT GNATGA 766
(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 752 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

ATGCATCACC ATCACCATCA CGATGAAGTC ACGGTAGAGA CGACCTCCGT CTTCCGCGCA 60

GACTTCCTCA GCGAGCTGGA CGCTCCTGCG CAAGCGGGTA CGGAGAGCGC GGTCTCCGGG 120

GTGGAAGGGC TCCCGCCGGG CTCGGCGTTG CTGGTAGTCA AACGAGGCCC CAACGCCGGG 180

TCCCGGTTCC TACTCGACCA AGCCATCACG TCGGCTGGTC GGCATCCCGA CAGCGACATA 240

TTTCTCGACG ACGTGACCGT GAGCCGTCGC CATGCTGAAT TCCGGTTGGA AAACAACGAA 300

TTCAATGTCG TCGATGTCGG GAGTCTCAAC GGCACCTACG TCAACCGCGA GCCCGTGGAT 360

TCGGCGGTGC TGGCGAACGG CGACGAGGTC CAGATCGGCA AGCTCCGGTT GGTGTTCTTG 420

ACCGGACCCA AGCAAGGCGA GGATGACGGG AGTACCGGGG GCCCGTGAGC GCACCCGATA 480

GCCCCGCGCT GGCCGGGATG TCGATCGGGG CGGTCCTCCG ACCTGCTACG ACCGGATTTT 540

CCCTGATGTC CACCATCTCC AAGATTCGAT TCTTGGGAGG CTTGAGGGTC NGGGTGACCC 600

CCCCGCGGGC CTCATTCNGG GGTNTCGGCN GG1TTCACCC CNTACCNACT GCCNCCCGGN 660

TTGCNAATTC NTTCTTCNCT GCCCNNAAAG GGACCNTTAN CTTGCCGCTN GAAANGGTNA 720

TCCNGGGCCC NTCCTNGAAN CCCCNTCCCC CT 752
(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 813 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

CATATGCATC ACCATCACCA TCACACHCT AACCGCCCAG CGCGTCGGGG GCGTC6AGCA 60 

CCACGC6ACA CCGGGCCCGA TCGATCTGCT AGCTTGAGTC TGGTCAGGCA TCGTCGTCAG 120 

CAGCGCGATG CCCTATGTTr GTCGTCGACT CAGATATCGC GGCAATCCAA TCTCCCGCCT 180 

GCGGCCGGCG GTGCTGCAAA CTACTCCCGG AGGAATTTCG ACGT6CGCAT CAAGATCHC 240 

ATGCT6GTCA CGGCTGTCGT TTTGCTCTGT TGHCGGGTG TGGCCACGGC CGCGCCCAAG 300 

ACCTACTGCG AGGAGTTGAA AGGCACCGAT ACCGGCCAGG CGTGCCAGAT TCAAATGTCC 360

GACCCGGCCT ACAACATCM CATCAGCCTG CCCAGHACT ACCCCGACCA GAAGTCGCTG 420

GAAAATTACA TCGCCCAGAC GC6CGACAAG HCCTCAGCG C6GCCACATC GTCCACTCCA 480 

CGCGAAGCCC CCTACGAAH GAATATCACC TCGGCCACAT ACCAGTCCGC GATACCGCCG 540 

CGT6GTACGC AG6CCGTGGT GCTCAMGGTC TACCACAACG CCGGCGGCAC GCACCCAACG 600 

ACCACGTACA AGGCCTTCGA TTGGGACCAG GCCTATCGCA AGCCAATCAC CTAT6ACACG 660 

CTGTGGCAGG CTGACACCGA TCCGCTGCCA GTCGTC7TCC CCAHGnGC AA6GTGAACT 720 

GAGCAACGCA GACCGGGACA ACWGGTATCG ATAGCCGCCN AATGCCGGCT TGGAACCCNG 780 

TGAAATTATC ACAACTTCGC AGTCACNAAA NAA 813 
(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 447 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

CGGTATGAAC ACGGCCGCGT CCGATAACTT CCAGCTGTCC CAGGGTGGGC AGGGATTCGC 60 

CAHCCGATC GGGCAGGCGA TGGCGATCGC GGGCCAGATC CGATCGGGTG GGGGGTCACC 120 

CACCGTTCAT ATCGGGCCTA CCGCCTTCCT CGGCHGGGT GTTGTCGACA ACAACGGCAA 180 

CGGC6CACGA GTCCAACGCG TGGTCGGGAG CGCTCCGGCG GCAAGTCTCG GCATCTCCAC 240 

CGGCGACGTG ATCACCGCGG TCGACGGCGC TCCGATCAAC TCGGCCACCG CGATGGCGGA 300

CGCGCTTAAC GGGCATCATC CCGGTGACGT CATCTCGGTG AACTGGCAAA CCMGTCGGG 360

CGGCACGCGT ACAGGGAACG T6ACATTGGC CGAGGGACCC CCGGCCTGAT TTCGTCGYGG 420 

ATACCACCCG CCGGCCGGCC AATTGGA 447 
(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 604 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

GTCCCACTGC GGTCGCCGAG TATGTCGCCC AGCAAATGTC TGGCAGCCGC CCAACGGAAT 60 

CCGGTGATCC GACGTCGCAG GTTGTCGAAC CCGCCGCCGC GGAAGTATCG GTCCATGCCT 120

AGCCC6GCGA CGGCGAGCGC CGGAATGGCG CGAGTGAG6A GGCGGGCAAT TTGGCGGG6C 180 

CCG6C6ACGG NGAGCGCCGG AATGGCGCGA GTGAGGAGGT GGNCAGTCAT GCCCAGNGTG 240 

ATCCAATCAA CCTGNATTCG GNCTGNGGGN CCAITTGACA ATCGAGGTAG TGAGCGCAAA 300 

TGAATGATGG AAAACGGGNG GNGACGTCCG NTGHCTGGT GGTGNTAGGT GNCTGNCTGG 360 

NGTNGNGGNT ATCAGGATGT TCTTCGNCGA AANCTGATGN CGAGGAACAG GGTGTNCCCG 420 

NNANNCCNAN GGNGTCCNAN CCCNNNNTCC TCGNCGANAT CANANAGNCG NTTGATGNGA 480 

NAAAAGGGTG GANCAGNNNN AANTNGNGGN CCNAANAANC NNNANNGNNG NNAGNTNGNT 540 



NNNTNTTNNC ANNNNNNNTG NNGNNGNNCN NNNCAANCNN NTNNNNGNAA NNGGNTTNTT 600

NAAT 604



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 633 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

TTGCANGTCG AACCACCTCA CTAAAGGGAA CAAAAGCTNG AGCTCCACCG CGGTGGCGGC 60 

CGCTCTAGAA CTAGTGKATM YYYCKGGCTG CAGSAATYCG GYACGAGCAT TAGGACAGTC 120 

TAACGGTCCT GHACGGTGA TCGAATGACC GACGACATCC TGCTGATCGA CACC6ACGAA 180 

CGGGTGCGAA CCCTCACCCT CAACCGGCCG CAGTCCCGYA ACGCGCTCTC GGCGGCGCTA 240 

CGGGATCGGT TTTTCGCGGY GTTGGYCGAC GCCGAGGYC6 ACGACGACAT CGACGTCGTC 300 

ATCCTCACCG GYGCCGATCC GGTGTTCTGC GCCGGACTGG ACCTCAAGGT AGCTGGCCGG 360 

6CAGACCGCG CTGCCGGACA TCTCACCGCG GTGGGCGGCC ATGACCAAGC CGGTGATCGG 420 

CGCGATCAAC GGCGCCGCGG TCACCGGCG6 GCTCGAACTG GCGCTGTACT GCGACATCCT 480 

GATCGCCTCC GAGCACGCCC GCHCGNCGA CACCCACGCC CGGGTGGGGC TGCTGCCCAC 540 

CTGGGGACTC AGTGTGTGCT TGCCGCAAAA GGTCGGCATC GGNCTGGGCC GGTGGATGAG 600 

CCTGACCGGC GACTACCTGT CCGTGACCGA CGC 633 
(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1362 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:

CGACGACGAC GGCGCCGGAG AGCGGGCGCG AACGGCGATC GACGCGGCCC TGGCCAGAGT 60 

CGGCACCACC CAGGAGGGAG TCGAATCATG AAATTTGTCA ACCATAHGA GCCCGTCGCG 120 

CCCCGCCGAG CCGGCGGCGC GGTCGCC6AG GTCTATGCCG AGGCCCGCCG CGAGHCGGC 180 

CG6CTGCCC6 AGCCGCTCGC CATGCT6TCC CCGGACGAGG GACTGCTCAC CGCC6GCTGG 240 

GC6ACGTTGC 6CGAGACACT GCTGGTGGGC CAGGTGCC6C 6TGGCCGCAA GGAAGCCGTC 300 

GCCGCCGCCG TCGCGGCCAG CCTGCGCTGC CCCTGGTGCG TC6ACGCACA CACCACCATG 360 

CTGTACGCGG CAGGCCAAAC CGACACCGCC GCGGCGATCT TGGCCGGCAC AGCACCTGCC 420 

GCCGGTGACC CGAACGCGCC GTATGTGGCG TGGGCGGCA6 GAACCGGGAC ACCGGCGGGA 480 

CCGCCGGCAC CGHCGGCCC GGATGTCGCC GCCGAATACC TGGGCACCGC GGTGCAATTC 540 

CACTTCATCG CACGCCTGGT CCTGGTGCTG CTGGACGAAA CCTTCCTGCC GG6GGGCCCG 600 

CGCGCCCAAC AGCTCATGCG CCGCGCCGGT GGACTGGTGT TCGCCC6CAA GGTGCGC6CG 660 

GAGCATCGGC CGGGCCGCTC CACCCGCCGG CTCGAGCCGC GAACGCTGCC CGACGATCTG 720

GCATGGGCM CACCGTCCGA GCCCATAGCA ACCGCGTTCG CCGCGCTCAG CCACCACCTG 780

GACACCGCGC CGCACCTGCC GCCACCGACT CGTCAGGTGG TCAG6CGGGT CGTGGGGTCG 840 

TGGCACGGCG AGCCAATGCC GATGAGCAGT CGCTGGACGA ACGAGCACAC CGCCGAGCTG 900 

CCCGCCGACC TGCACGCGCC CACCCGTCTT GCCCTGCTGA CCGGCCTGGC CCCGCATCAG 960 

GT6ACCGACG AC6ACGTCGC CGCGGCCCGA TCCCT6CTCG ACACCGATGC GGCGCT6GTT 1020 

GGCGCCCTGG CCTGGGCCGC CHCACCGCC GCGCGGCGCA TCGGCACCTG 6ATCGGCGCC 1080 

6CCGCCGAGG GCCAGGTGTC GCGGCAAAAC CCGACTGGGT GAGTGTGCGC GCCCTGTCGG 1140 

TAGGGTGTCA TCGCTGGCCC GAGGGATCTC GCGGCGGCGA ACGGAGGTGG CGACACA6GT 1200 

GGAAGCTGCG CCCACTGGCT TGCGCCCCAA CGCCGTCGTG GGCGTTC6GT TGGCCGCACT 1260 

GGCCGATCAG GTC6GCGCCG GCCCnGGCC GAAGGTCCAG CTCAACGTGC CGTCACCGAA 1320 

GGACCGGACG GTCACCGGGG GTCACCCTGC 6CGCCCAAGG AA 1362 
(2) INFORMATION FOR SEQ ID NO:8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1458 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:

GCGACGACCC CGATATGCCG GGCACCGTAG CGAAAGCCGT CGCCGACGCA CTCGGGCGCG 60

GTATCGCTCC CGHGAGGAC ATTCAGGACT GCGTGGAGGC CCGGCTGGGG GAAGCCGGTC 120

TGGATGACGT GGCCCGTGTT TACATCATCT ACCGGCAGCG GCGCGCCGAG CTGCGGACGG 180

CTAAGGCCTT GCTC6GCGTG CGGGACGAGT TAAAGCT6AG CnGGCGGCC GT6ACGGTAC 240 

TGC6CGAGCG CTATCTGCT6 CACGACGAGC AGGGCCGGCC GGCCGAGTCG ACCGGCGAGC 300 

TGATGGACCG ATC6GCGCGC TGTGTCGCGG CGGCCGAGGA CCAGTATGAG CCGGGCTCGT 360 

CGAGGCGGTG GGCCGAGCGG HCGCCACGC TATTACGCAA CCTGGAATTC CTGCCGAATT 420 

CGCCCACGH GATGAACTCT GGCACCGACC TGGGACTGCT CGCCGGCTGT TTTGTTCTGC 480 

CGATTGAGGA TTCGCTGCAA TCGATCTTTG CGACGCTG6G ACAGGCCGCC GAGCTGCAGC 540 

GGGCTGGAGG CGGCACCGGA TATGCGTTCA GCCACCTGCG ACCCGCCGGG GATCGGGTGG 600 

CCTCCACGGG CGGCACGGCC AGCGGACCGG TGTCGTTTCT ACGGCTGTAT GACAGTGCCG 660 

CGGGTGTGGT CTCCATGGGC GGTCGCCGGC GTGGCGCCTG TATGGCTGTG CTTGATGTGT 720 

CGCACCCGGA TATCTGTGAT TTCGTCACCG CCAAGGCCGA ATCCCCCAGC GAGCTCCCGC 780 

ATTTCAACCT ATCGGnGGT GT6ACCGACG CGHCCTGCG GGCCGTCGAA CGCAACGGCC 840 

TACACCGGCT GGTCAATCCG CGAACCGGCA AGATCGTCGC GCGGATGCCC GCCGCCGAGC 900 

TGHCGACGC CATCTGCAAA 6CC6CGCACG CCGGTGGCGA TCCCGGGCTG GTGTTTCTCG 960 

ACACGATCAA TAGGGCAAAC CCGGTGCCGG GGAGAGGCCG CATCGAGGCG ACCAACCCGT 1020 

GCGGGGAGGT CCCACT6CTG CCTTACGAGT CATGTAATCT CGGCTCGATC AACCTCGCCC 1080 

GGATGCTCGC CGACG6TCGC GTCGACTGGG ACCGGCTCGA GGAGGTCGCC GGTGT6GCGG 1140 

TGCGGTTCCT TGATGACGTC ATCGATGTCA GCCGCTACCC CHCCCCGAA CTGGGTGAGG 1200

CGGCCCGCGC CACCCGCAAG ATCGGGCTGG GAGTCATGGG TTTGGCGGAA CTGCTTGCCG 1260

CACTGGGTAT TCCGTACGAC AGTGAAGAA6 CCGTGCGGTT AGCCACCCGG CTCATGCGTC 1320 

GCATACAGCA GGCGGCGCAC ACGGCATCGC GGAGGCT6GC CGAA6AGCGG GGCGCATTCC 1380 

CGGCGTTCAC CGATAGCCGG HCGCGCGGT CGGGCCCGAG GCGCAACGCA CAGGTCACCT 1440

CCGTCGCTCC GACGGGCA 1458
(2) INFORMATION FOR SEQ ID NO:9:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 862 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:

ACGGTGTAAT CGTGCTGGAT CTGGAACCGC GTGGCCCGCT ACCTACCGAG ATCTACT6GC 60 

GGCGCAGGGG GCTGGCCCTG GGCATCGCGG TCGTCGTAGT CGGGATCGCG GTGGCCATCG 120 

TCATCGCCn CGTCGACAGC AGCGCCGGTG CCAAACCGGT CAGCGCCGAC AAGCCGGCCT 180 

CCGCCCAGAG CCATCCGGGC TCGCCGGCAC CCCAA6CACC CCAGCCGGCC GGGCAAACCG 240 

AAGGTAACGC CGCCGCGGCC CCGCCGCAGG GCCAAAACCC CGAGACACCC ACGCCCACCG 300 

CCGCGGTGCA GCCGCCGCCG GTGCTCAAGG AAGG6GAC6A TTGCCCCGAT TCGACGCTGG 360 

CCGTCAAAGG TTTGACCAAC GCGCCGCAGT ACTACGTCGG CGACCAGCCG AAGHCACCA 420

TGGTGGTCAC CAACATCGGC CTGGTGTCCT GTAMCGCGA CGTTGGGGCC GCGGTG7TGG 480

CCGCCTAC6T TTACTCGCT6 GACAACAAGC GGTTGTG6TC CAACCTGGAC TGCGCGCCCT 540 

CGAATGAGAC GCTGGTCAAG ACGTnTCCC CCGGTGAGCA GGTAACGACC GCGGTGACCT 600 

GGACC66GAT GG6ATCGGCG CC6CGCTGCC CATTGCCGCG GCCGGCGATC GGGCCGGGCA 660 

CCTACAATCT CGTGGTACAA CTGGGCAATC TGCGCTCGCT GCCGGHCCG HCATCCTGA 720 

ATCAGCCGCC GCCGCCGCCC GGGCCGGTAC CCGCTCCGGG TCCAGCGCAG GCGCCTCCGC 780 

CGGAGTCTCC CGCGCAAGGC GGATAAHAT TGATCGCTGA TGGTCGATTC CGCCAGCTGT 840 

GACAACCCCT CGCCTCGTGC CG 862 
(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 622 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:

HGATCAGCA CCGGCAAGGC GTCACATGCC TCCCTGGGTG TGCAGGTGAC CAATGACAAA 60

GACACCCCGG GCGCCAAGAT CGTCGAAGTA GTGGCCGGTG GTGCTGCCGC GAACGCTGGA 120 

GTGCCGAAGG 6CGTCGTTGT CACCAA6GTC GACGACCGCC C6ATCAACAG CGCGGACGCG 180 

TTGGTT6CCG CCGTGCGGTC CAAAGCGCC6 GGC6CCACGG TG6CGCTAAC CTTTCAGGAT 240 

CCCTCGGGCG GTAGCCGCAC AGTGCAAGTC ACCCTCGGCA AGGCGGAGCA GTGATGAAGG 300

TCGCCGCGCA GTGTTCAAAG CTCGGATATA CGGTGGCACC CATGGAACAG CGTGCGGAGT 360

TGGTGGTTGG CCGGGCACTT GTCGTCGTCG HGACGATCG CACG6CGCAC GGCGATGAAG 420 

ACCACAGCGG GCCGCTTGTC ACCGAGCTGC TCACCGAGGC CGG6TTTGTT GTCGACGGCG 480 

TGGTGGCGGT GTCGGCCGAC GAGGTCGAGA TCCGAAATGC GCT6AACACA GCG6TGATCG 540 

GC6GGGTGGA CCTGGTGGTG TC6GTCGGCG GGACCG6NGT GACGNCTCGC GATGTCACCC 600 

CGGAAGCCAC CCGNGACATT CT 622 
(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1200 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:

GGCGCAGCGG TAAGCCTGTT GGCCGCCGGC ACACTGGTGT TGACA6CATG CGGCGGTGGC 60 

ACCAACAGCT CGTCGTCAGG CGCAGGCG6A ACGTCTGGGT CGGTGCACTG CGGC6GCAAG 120 

AAGGAGCTCC ACTCCAGCGG CTCGACCGCA CAAGAAAATG CCATGGAGCA GTTCGTCTAT 180 

GCCTAC6TGC GATCGTGCCC GGGCTACACG TTGGACTACA ACGCCAACGG 6TCCGGTGCC 240 

6GGGTGACCC AGTITCTCAA CAACGAAACC GATTTCGCCG GCTCGGATGT CCCGTTGAAT 300 

CCGTCGACCG GTCAACCTGA CCGGTCGGCG GAGCGGTGCG GTTCCCCGGC ATGGGACCTG 360

CCGACGGTGT TCGGCCCGAT CGCGATCACC TACAATATCA AGGGCGTGAG CACGCTGAAT 420

CTTGACGGAC CCACTACCGC CAAGATTTTC AACGGCACCA TCACCGTGTG GAATGATCCA 480 

CAGATCCAAG CCCTCAACTC CGGCACCGAC CTGCCGCCAA CACCGATTAG CGTTATCTTC 540 

CGCAGCGACA AGTCCGGTAC GTCGGACAAC HCCAGAAAT ACCTCGACGG TGTATCCAAC 600 

GGGGCGTGGG 6CAAA6GCGC CAGCGAAACG TTCAGCGGGG GCGTCGGCGT CGGCGCCAGC 660 

G6GAACAACG GAACGTCGGC CCTACTGCAG ACGACCGACG GGTCGATCAC CTACAACGAG 720 

T6GTCGTTTG CGGTGGGTAA GCAGHGAAC ATGGCCCAGA TCATCACGTC 6GCGGGTCCG 780 

GATCCA6T6G CGATCACCAC C6AGTCGGTC GGTAAGACAA TCGCCGGGGC CAAGATCATG 840 

GGACAAGGCA ACGACCTGGT AHGGACACG TCGTCGTTCT ACAGACCCAC CCAGCCTGGC 900 

TCTTACCCGA TCGTGCTGGC GACCTATGAG ATCGTCTGCT CGAAATACCC GGATGCGACG 960 

ACCGGTACTG CGGTAAGGGC GTHATGCAA GCCGCGATTG GTCCAGGCCA AGAAGGCCTG 1020 

6ACCAATACG GCTCCAHCC GTTGCCCAAA TCGTTCCAAG CAAAAHGGC GGCCGCGGTG 1080 

AATGCTATTT CTTGACCTAG TGAAGGGAAT TCGACGGT6A GCGAT6CCGT TCCGCAGGTA 1140 

GGGTCGCAAT TTGGGCCGTA TCAGCTATTG CGGCTGCTGG GCCGAGGCGG GAT6GGCGAG 1200 

(2) INFORMATION FOR SEQ ID NO:12:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1155 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:

GCAAGCA6CT GCAGGTCGTG CTGTTCGACG AACTG6GCAT GCCGAAGACC AAAC6CACCA 60 

AGACCGGCTA CACCACGGAT GCCGACGCGC TGCAGTCGH GTTCGACAAG ACCGGGCATC 120 

CGTTTCT6CA ACATCTGCTC GCCCACCGCG ACGTCACCCG GCTCAAGGTC ACCGTCGACG 180 

GG7TGCTCCA AGCGGTGGCC GCCGACGGCC GCATCCACAC CACGTTCAAC CAGACGATCG 240 

CCGCGACCGG CCGGCTCTCC TCGACCGAAC CCAACCTGCA GAACATCCCG ATCC6CACCG 300 

ACGCGGGCCG GCGGATCCGG GACGC6TTCG TGGTCGGGGA CGGTTACGCC GAGTTGATGA 360 

CGGCCGACTA CAGCCAGATC GAGATGCGGA TCATGGGGCA CCTGTCCGGG GACGAGGGCC 420 

TCATCGAGGC GTTCAACACC GGGGAGGACC TGTAnCGTT CGTCGCGTCC CGGGTGTTCG 480 

GT6TGCCCAT CGACGAGGTC ACCGGCGAGT TGCGGCGCCG GGTCAAGGCG ATGTCCTACG 540 

6GCTGGTTTA CGGGTTGAGC GCCTACGGCC TGTC6CAGCA GHGAAAATC TCCACCGAGG 600 

AAGCCAACGA GCAGATGGAC GCGTATTTCG CCCGAHCGG CGGGGTGCGC GACTACCTGC 660 

GCGCCGTAGT CGAGCGGGCC CGCAAGGACG GCTACACCTC GACGGTGCTG GGCCGTCGCC 720 

GCTACCTGCC CGAGCTGGAC AGCAGCAACC GTCAAGTGCG GGAGGCCGCC GAGCGGGCGG 780 

CGCTGAACGC GCCGATCCAG GGCAGCGCGG CCGACATCAT CAAGGTGGCC ATGATCCAGG 840 

TCGACAAGGC GCTCAACGAG GCACAGCTGG CGTCGCGCAT GCTGCTGCAG GTCCACGACG 900 

AGCTGCTGH CGAAATCGCC CCCGGTGAAC GCGAGCGGGT CGAGGCCCTG GTGCGCGACA 960

AGATGGGCGG CGCTTACCCG CTCGACGTCC CGCTGGAGGT GTCGGTGGGC TACGGCCGCA 1020

GCT6GGACGC GGCGGCGCAC T6AGTGCCGA GCGT6CATCT GGGGCGGGAA nCGGCGAH 1080 

TTTCCGCCCT 6AG1TCACGC TCGGCGCAAT CGGGACCGA6 TTTGTCCAGC GTGTACCCGT 1140 

CGAGTAGCCT CGTCA 1155 
(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1771 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

GAGCGCCGTC TGGTGTTTGA ACGGTTTTAC CGGTCGGCAT CGGCACGGGC GTTGCCGGGT 60 

TCGGGCCTCG GGHGGCGAT CGTCAAACAG GTGGTGCTCA ACCACGGCGG ATTGCTGCGC 120 

ATCGAAGACA CCGACCCAGG CGGCCAGCCC CCTGGAACGT CGATTTACGT GCTGCTCCCC 180 

GGCCGTCGGA TGCCGATTCC GCAGCTTCCC GGTGCGACGG CTGGCGCTCG GAGCACGGAC 240 

ATCGAGAACT CTCGGGGHC GGCGAACGTT ATCTCAGTGG AATCTCAGTC CACGCGCGCA 300 

ACCTAGTTGT GCAGTTACTG HGAAAGCCA CACCCATGCC AGTCCACGCA TGGCCAAGTT 360 

GGCCCGAGTA GTGGGCCTAG TACAGGAAGA GCAACCTAGC GACATGACGA ATCACCCACG 420 

GTATTCGCCA CCGCCGCAGC AGCCGGGAAC CCCAGGTTAT GCTCAGGGGC AGCAGCAAAC 480 

GTACAGCCAG CAGHCGACT GGCGHACCC ACCGTCCCCG CCCCCGCAGC CAACCCAGTA 540

CCGTCAACCC TACGAGGCGT TGGGTGGTAC CCGGCCGGGT CTGATACCTG GCGTGATTCC 600

GACCATGACG CCCCCTCCTG GGAIGGHCG CCAAC6CCCT CGTGCAGGCA TGTTGGCCAT 660 

C6GCGCGGTG ACGATAGCGG TGGTGTCCGC CGGCATCGGC 6GC6C66CCG CATCCCTGGT 720 

CGGGTTCAAC CGGGCACCCG CCGGCCCCAG CGGCGGCCCA GTGGCTGCCA 6CGCGGC6CC 780 

AAGCATCCCC GCAGCAAACA T6CCGCCGGG GTCGGTCGAA CAGGT6GCGG CCAAGGTGGT 840 

GCCCAGTGTC GTCATGnGG AAACCGATCT GGGCCGCCAG TCGGAGGAGG GCTCCGGCAT 900 

CATTCTGTCT GCCGAGGGGC TGATCTTGAC CAACAACCAC GTGATCGCGG CGGCCGCCAA 960 

6CCTCCCCTG GGCAGTCCGC CGCCGAAAAC GACGGTAACC HCTCTGACG GGCGGACCGC 1020 

ACCCTTCACG GTGGTGGGG6 CTGACCCCAC CAGTGATATC GCCGTCGTCC GTGTTCAGGG 1080 

CGTCTCCGGG CTCACCCCGA TCTCCCTGGG UCCTCCTCG GACCTGAGGG TCGGTCAGCC 1140 

GGTGCTGGCG ATCGGGTCGC CGCTCGGTTT GGAGGGCACC GTGACCAC6G 6GATCGTCAG 1200 

CGCTCTCAAC CGTCCAGTGT CGACGACC6G CGAGGCCGGC AACCAGAACA CCGTGCTGGA 1260 

CGCCATTCAG ACCGACGCCG CGATCAACCC CGGTAACTCC GGGGGCGCGC TGGTGAACAT 1320 

GAACGCTCAA CTCGTCGGAG TCAACTCGGC CAHGCCACG CT6GGCGCGG ACTCAGCCGA 1380 

TGCGCAGAGC GGCTCGATCG GTCTCGGTTT TGCGAHCCA GTCGACCAGG CCAAGCGCAT 1440 

CGCCGACGAG TTGATCAGCA CCGGCAAGGC GTCACATGCC TCCCT6GGTG TGCAGGT6AC 1500 

CAATGACAAA GACACCCCGG GCGCCAA6AT CGTCGAAGTA GTGGCCGGTG GTGCTGCCGC 1560 

GAACGCTGGA GTGCCGAAGG GCGTCGTTGT CACCAAGGTC GACGACCGCC CGATCAACAG 1620

CGCGGACGCG TTGGTTGCCG CCGTGCGGTC CAAAGCGCCG GGCGCCACGG TGGCGCTMC 1680

CTTTCAGGAT CCCTCGGGCG GTAGCCGCAC AGTGCAAGTC ACCCTCGGCA AGGCGGAGCA 1740

GTGATGAAGG TCGCCGCGCA GTGHCAAAG C 1771
(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1058 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

CTCCACCGCG GTGGCGGCCG CTCTAGAACT A6TGGATCCC CCGGGCTGCA GGAAHCGGC 60 

ACGAGGATCC GACGTCGCAG GTTGTCGAAC CCGCCGCCGC 66AAGTATCG GTCCATGCCT 120 

AGCCCGGCGA CGGCGAGCGC C6GAAT6GCG CGAGTGAGGA GGCGGGCAAT TTGGCGGGGC 180 

CCGGCGACGG CGAGCGCC6G AATGGCGCGA GTGAGGAGGC GGGCAGTCAT 6CCCAGCGTG 240 

ATCCAATCAA CCTGCATTCG GCCTGCGGGC CCATTTGACA ATC6AGGTAG TGAGCGCAAA 300 

TGAATGAT6G AAAACGGGCG 6TGACGTCCG CTGHCTGGT GGTGCTAGGT GCCTGCCT6G 360 

CGTTGTG6CT ATCAGGATGT TCTTC6CCGA AACCTGATGC CGAGGAACAG GGTGTTCCCG 420 

TGAGCCCGAC GGCGTCCGAC CCCGCGCTCC TCGCCGAGAT CAGGCAGTCG CTTGATGCGA 480

CAAAAGGGH GACCAGCGTG CACGTAGCGG TCCGAACAAC CGGGAAAGTC GACAGCTTGC 540

TGGGTATTAC CAGTGCCGAT GTCGACGTCC GGGCCAATCC GCTCGCGGCA AAGGGCGTAT 600

GCACCTACAA CGACGAGCAG GGTGTCCCGT TTCGGGTACA AGGCGACAAC ATCTCGGTGA 660

AACTGTTCGA CGACTGGAGC AATCTCGGCT CGATTTCTGA ACTGTCAACT TCACGCGTGC 720

TCGATCCTGC CGCTGGGGTG ACGCAGCTGC TGTCCGGTGT CACGAACCTC CAAGCGCAAG 780

GTACCGAAGT GATAGACGGA ATTTCGACCA CCAAAATCAC CGGGACCATC CCCGCGAGCT 840

CTGTCAAGAT GCTTGATCCT 6GCGCCAAGA GTGCAAGGCC 6GCGACCGTG TGGAnGCCC 900 

AGGACG6CTC GCACGACCTC GTCCGAGCGA GCATCGACCT CGGATCCG6G TCGATTCAGC 960 

TCACGCAGTC GAAATGGAAC GAACCCGTCA ACGTCGACTA GGCCGAAGH GCGTC6ACGC 1020 

G1TGNTCGAA ACGCCCTTGT GAACGGTGTC AACGGNAC 1058 
(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 542 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:

GAAHCGGCA CGAGA6GTGA TCGACATCAT CG6GACCAGC CCCACATCCT GGGAACAGGC 60 

6GCGGCGGAG GCGGTCCAGC GGGCGCGGGA TAGCGTCGAT GACATCCGCG TCGCTCGGGT 120 

CATTGAGCAG GACATGGCCG TGGACAGCGC CGGCAAGATG ACCTACCGCA TCAAGCTCGA 180 

AGTGTCGTTC AAGATGAGGC CGGCGCAACC GCGCTAGCAC GGGCCGGCGA GCAAGACGCA 240

AAATCGCACG GTTTGCGGTT GAHCGTGCG ATTTTGTGTC TGCTCGCCGA GGCCTACCAG 300

GCGCGGCCCA GGTCCGCGTG CTGCCGTATC CAGGCGTGCA TCGCGAHCC GGCGGCCACG 360 

CCGGAGTTAA TGCTTCGCGT CGACCCGAAC TGGGCGATCC GCC6GN6AGC T6ATCGATGA 420 

CCGTGGCCAG CCCGTCGATG CCCGAGTTGC CCGAGGAAAC GTGCTGCCAG GCCGGTAGGA 480 

AGCGTCCGTA GGCGGCGGTG CTGACCGGCT CTGCCTGCGC CCTCAGTGCG GCCAGCGAGC 540

GG 542

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 913 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

C6GTGCCGCC CGCGCCTCCG HGCCCCCAT TGCCGCCGTC 6CCGATCAGC TGCGCATCGC 60 

CACCATCACC GCCTTTGCCG CCGGCACCGC CGGTGGCGCC GGGGCC6CCG ATGCCACCGC 120 

TTGACCCTGG CCGCCGGCGC CGCCAHGCC ATACAGCACC CCGCCGGGGG CACCGTTACC 180 

GCC6TCGCCA CCGTCGCCGC CGCTGCCGTT TCAGGCCGGG 6AGGCCGAAT GAACCGCCGC 240 

CAAGCCCGCC GCCGGCACCG TTGCCGCCTT TTCCGCCCGC CCC6CCGGCG CCGCCAATTG 300 

CCGAACAGCC AMGCACCGH GCCGCCAGCC CCGCCGCCGT TAACGGCGCT GCCGGGCGCC 360

GCCGCCGGAC CCGCCATTAC CGCCGTTCCC GTTCGGTGCC CCGCCGTTAC CGGCGCCGCC 420

GTTTGCCGCC AATATTCGGC GGGCACCGCC AGACCCGCCG GGGCCACCAT TGCCGCCGGG 480

CACCGAAACA ACAGCCCAAC GGTGCCGCCG GCCCCGCCGT TTGCCGCCAT CACCGGCCAT 540

TCACCGCCAG CACCGCCGTT AATGTTTATG AACCCGGTAC CGCCAGCGCG GCCCCTATTG 600

CCGGGCGCCG GAGNGCGTGC CCGCCGGCGC CGCCAACGCC CAAAAGCCCG GGGTTGCCAC 660

CGGCCCCGCC GGACCCACCG GTCCCGCCGA TCCCCCCGH GCCGCCGGTG CCGCCGCCAT 720

TGGTGCTGCT GAAGCCGHA GCGCCGGHC CGCSGGTTCC GGCGGTGGCG CCNTGGCCGC 780

CGGCCCCGCC GTTGCCGTAC AGCCACCCCC CGGTGGCGCC GTTGCCGCCA HGCCGCCAT 840

TGCCGCCGTT GCCGCCATTG CCGCCGTTCC CGCCGCCACC GCCGGNnGG CCGCCGGCGC 900

CGCCGGCGGC CGC 913

(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1872 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

GACTACGTTG GTGTAGAAAA ATCCTGCCGC CCGGACCCTT AAGGCTGGGA CAATTTCTGA 60

TAGCTACCCC GACACAGGAG GHACGGGAT GAGCAATTCG CGCCGCCGCT CACTCAGGTG 120

GTCATGGTTG CTGAGCGTGC TGGCTGCCGT CGGGCTGGGC CTGGCCACGG CGCCGGCCCA 180



GGCGGCCCCG CCGGCCTTGT CGCAG6ACCG GTTCGCCGAC TTCCCCGCGC TGCCCCTCGA 240 

CCCGTCCGCG ATGGTC6CCC AAGTGGCGCC ACAGGTGGTC AACATCAACA CCAAACT6GG 300 

CTACAACAAC GCCGTGGGCG CCGGGACCG6 CATCGTCATC GATCCCAACG GTGTCGTGCT 360 

6ACCAACAAC CACGTGATCG CGGGCGCCAC CGACATCAAT GCGHCAGCG TCGGCTCCGG 420 

CCAAACCTAC GGCGTCGATG TGGTCGGGTA TGACCGCACC CAGGATGTCG CGGTGCTGCA 480 

GCTGCGCGGT GCCGGTGGCC TGCCGTCGGC GGCGATCGGT GGCGGCGTCG CGGTTGGTGA 540 

GCCCGTCGTC GCGATGGGCA ACAGCGGTGG GCAGGGCGGA ACGCCCCGT6 CGGTGCCTGG 600 

CAGGGTGGTC GCGCTCG6CC AAACCGTGCA GGCGTCGGAT TCGCTGACCG GT6CCGAAGA 660 

GACATTGAAC GGGTTGATCC AGTTCGATGC CGCAATCCAG CCCGGIGAH CGGGCGG6CC 720 

CGTCGTCAAC GGCCTAGGAC AGGTGGTC6G TATGAACACG 6CCGCGTCCG ATAACTTCCA 780 

GCTGTCCCAG GGTGGGCAGG GATTCGCCAT TCCGATCGGG CAGGCGATGG CGATCGCGGG 840 

CCAAATCCGA TCGGGTGGGG GGTCACCCAC CGHCATATC GGGCCTACCG CCTTCCTCG6 900 

CnGGGTGTT GTCGACAACA ACGGCAACGG CGCACGAGTC CAACGCGTGG TCGGAAGCGC 960 

TCCGGCGGCA AGTCTCGGCA TCTCCACC6G CGACGTGATC ACC6CGGTCG ACGGCGCTCC 1020 

GATCAACTCG GCCACCGCGA TGGCGGAC6C GCTTAACGGG CATCATCCCG GTGACGTCAT 1080 

CTCGGTGAAC TGGCAAACCA AGTCGGGCGG CACGCGTACA GGGAACGTGA CATTGGCCGA 1140 

GGGACCCCCG GCCTGATTTG TCGCGGATAC CACCCGCCGG CCGGCCAATT GGAHGGCGC 1200 

CAGCCGTGAT TGCCGCGTGA GCCCCCGAGT TCCGTCTCCC GTGCGCGTGG CATTGTGGAA 1260

GCAATGAACG AGGCAGAACA CAGCGTTGAG CACCCTCCCG TGCAGGGCAG HACGTCGAA 1320

GGCGGTGT6G tCGAGCATCC GGATGCCAAG GACTTCGGCA GCGCCGCCGC CCTGCCCGCC 1380 

GATCCGACCT GGTHAAGCA CGCCGTCHC TACGAGGTGC TGGTCCG6GC GTTCTTCGAC 1440 

GCCAGCGCGG ACGGTTCCGN CGATCTGCGT GGACTCATCG ATCGCCTCGA CTACCTGCAG 1500 

TGGCTT6GCA TCGACTGCAT CTGTT6CCGC CGHCCTACG ACTCACCGCT GCGCGACGGC 1560 

GGTTACGACA TTCGCGAC7T CTACAAGGTG CTGCCCGAAT TCGGCACCGT CGACGATTTC 1620 

GTCGCCCTGG TCGACACCGC TCACCGGCGA GGTATCCGCA TCATCACCGA CCTGGTGATG 1680 

AATCACACCT CGGAGTCGCA CCCCT6GTTT CAGGAGTCCC GCCGCGACCC AGACGGACCG 1740 

TACGGTGACT ATTACGTGTG GAGCGACACC AGCGAGCGCT ACACCGACGC CCGGATCATC 1800 

nCGTCGACA CCGAAGAGTC GAACTGGTCA HCGATCCTG TCCGCCGACA GTTNCTACTG 1860

GCACCGAHC TT 1872
(2) INFORMATION FOR SEQ ID NO:18:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1482 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:

CTTCGCCGAA ACCTGATGCC GAGGAACAGG GTGTTCCCGT GAGCCCGACG GCGTCCGACC 60

CCGCGCTCCT CGCCGAGATC AGGCAGTCGC TTGATGCGAC AAAAGGGTTG ACCAGCGTGC 120

ACGTAGCGGT CCGAACAACC GGGAAAGTCG ACA6CTTGCT GGGTATTACC AGTGCCGATG 180 

TCGACGTCCG GGCCAATCCG CTCGCGGCAA AGGGCGTATG CACCTACAAC GACGAGCAGG 240

GTGTCCCGTT TCG6GTACAA GGCGACAACA TCTCGGTGAA ACTGTTCGAC GACT6GAGCA 300 

ATCTCGGCTC GATTTCTGAA CTGTCAACTT CACGCGTGCT CGATCCTGCC GCTGGGGTGA 360 

CGCAGCTGCT GTCCGGTGTC ACGAACCTCC AAGCGCAAGG TACCGAAGTG ATAGACGGAA 420 

TTTCGACCAC CAAAATCACC GG6ACCATCC CCGCGAGCTC TGTCAAGATG CTTGATCCTG 480 

GCGCCAAGAG TGCAAGGCCG GCGACCGTGT GGATTGCCCA GGACGGCTCG CACCACCTCG 540 

TCCGAGCGAG CATCGACCTC GGATCCGGGT CGATTCAGCT CACGCAGTCG AAATGGAACG 600 

AACCCGTCAA CGTCGACTAG GCCGAAGTTG CGTCGACGCG HGCTCGAAA CGCCCHGIG 660 

AACGGTGTCA ACGGCACCCG AAAACTGACC CCCTGACGGC ATCTGAAAAT TGACCCCCTA 720 

GACCGGGCGG HGGTGGHA nCTTCGGTG GTTCCGGCTG GTGGGACGCG GCCGAGGTCG 780 

CGGTCTTTGA GCCGGTAGCT GTCGCCTTTG AGGGCGACGA CTTCAGCATG GTGGACGAGG 840 

CGGTCGATCA TGGCGGCAGC AACGACGTCG TCGCCGCCGA AAACCTCGCC CCACCGGCCG 900 

AAGGCCHAT TGGACGTGAC GATCAAGCTG GCCC6CTCAT ACCGGGAG6A CACCAGCTGG 960 

AAGAAGAGGT TGGCGGCCTC GGGCTCAAAC GGAATGTAAC CGACHCGTC AACCACCA6G 1020 

AGCGGATAGC GGCCAAACCG GGTGAGTTCG GCGTAGATGC 6CCCGGCGTG GTGAGCCTCG 1080 

GCGAACCGTG CTACCCAHC GGCGGCGGTG GCGAACAGCA CCCGATGACC GGCCTGACAC 1140

GCGCGTATCG CCAGGCCGAC CGCAAGATGA GTCHCCCGG TGCCAGGCGG GGCCCAAAAA 1200

CACGACGHA TCGCG6GCGG TGATGAAATC CA6GGTGCCC AGATGTGCGA T6GTGTCGCG 1260 

TTT6AGGCCA CGAGCATGCT CAAAGTCGAA CTCTTCCAAC GACTTCCGAA CCGGGAAGC6 1320 

GGCGGCGCGG AT6CGGCCCT CACCACCATG GGACTCCCGG GCTGACACTT CCCGCTGCAG 1380 

GCAGGCGGCC AGGTATTCTT CGTGGCTCCA GTTCTCGGCG C6GGCGCGAT C6GCCA6CCG 1440 

GGACACTGAC TCACGCAGG6 TGGGAGCTTT CAATGCTCTT 6T 1482 
(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 876 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

GAATTCGGCA CGAGCCGGCG ATAGCTTCTG GGCCGCGGCC GACCAGATGG CTCGAGGGTT 60 

CGTGCTCGGG 6CCACCGCCG GGCGCACCAC CCTGACCGGT GAGGGCCTGC AACACGCCGA 120 

CGGTCACTCG TT6CTGCT6G ACGCCACCAA CCCGGCGGTG GTTGCCTACG ACCCGGCCTT 180 

CGCCTACGAA ATCGGCTACA TCGNG6AAAG CGGACTGGCC AGGATGTGCG GGGAGAACCC 240 

GGAGAACATC TTCTTCTACA TCACCGTCTA CAACGAGCCG TACGTGCAGC CGCCGGAGCC 300 

GGAGAACTTC GATCCCGAGG GCGTGCTGGG G6GTATCTAC CGNTATCACG CGGCCACCGA 360 

GCAACGCACC AACAAGGNGC AGATCCTGGC CTCCGGGGTA GCGATGCCCG CGGCGCTGCG 420

GGCAGCACAG ATGCTGGCCG CCGAGTGGGA TGTCGCCGCC GACGTGTGGT CGGTGACCAG 480

7TGGGGCGAG CTAAACCGC6 ACGGGGT6GT CATCGAGACC GAGAAGCTCC GCCACCCC6A 540 

TCGGCCGGCG GGCGTGCCCT ACGTGACGAG AGCGCTGGA6 AAT6CTC6GG GCCCGGTGAT 600 

CGCGGTGTCG GACTGGATGC GCGCGGTCCC CGAGCAGATC CGACCGTGGG TGCCGGGCAC 660 

ATACGTCACG TTGGGCACCG ACGGGTTCGG TmTCCGAC ACTCGGCCCG CCGGTCGTCG 720 

nACTTCAAC ACCGACGCCG AATCCCAGGT TGGTCGCGGT TTTGGGAG6G GTTGGCCGGG 780 

TCGACG6GTG AATATCGACC CAHCGGIGC CGGTCGTGGG CCGCCCGCCC AGHACCCGG 840 

ATTCGACGAA GGTGGGGGGT TGCGCCCGAN TAAGTT 876
(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1021 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20:

ATCCCCCCGG GCTGCAGGAA TTCGGCAC6A 6AGACAAAAT TCCACGGGH AAT6CAGGAA 60 

CAGATTCATA ACGAATTCAC AGCGGCACAA CAATATGTCG CGATCGC6GT TTATrTCGAC 120 

AGCGAAGACC TGCCGCA6TT GGCGAAGCAT TTTTACAGCC AAGCGGTC6A GGAACGAAAC 180 

CATGCAATGA TGCTCGTGCA ACACCTGCTC GACCGCGACC TTCGTGTCGA AATTCCCGGC 240

GTAGACACGG TGCGAAACCA GHCGACAGA CCCCGCGAGG CACTGGCGCT GGCGCTCGAT 300

CAGGAACGCA CAGTCACCGA CCAGGTCGGT CGGCTGACAG CGGTGGCCCG CGACGAGGGC 360 

GATTTCCTCG GCGAGCAGH CATGCAGT6G TTCTTGCAGG AACAGATCGA AGA6GTGGCC 420 

TTGATGGCAA CCCTGGTGCG GGTTGCCGAT CGGGCCGGGG CCAACCTGTT CGAGCTAGAG 480 

AACnCGTCG CACGTGAAGT GGATGT6GCG CCGGCCGCAT CAGGCGCCCC GCACGCTGCC 540 

GGGGGCCGCC TCTAGATCCC TGGGGGGGAT CAGCGAGTGG TCCCGTTCGC CCGCCCGTCT 600 

TCCAGCCAGG CCTTGGT6CG GCCGGGGTGG TGAGTACCAA TCCAGGCCAC CCCGACCTCC 660 

CGGNAAAAGT CGATGTCCTC GTACTCATCG ACGTTCCAGG AGTACACCGC CCGGCCCTGA 720 

6CTGCCGAGC GGTCAACGAG HGCGGATAT TCCTTTAACG CAGGCAGTGA GGGTCCCACG 780 

GCGGTT6GCC CGACCGCCGT GGCCGCACTG CTGGTCAGGT ATCGGGGGGT CHGGCGAGC 840 

AACAACGTCG GCAGGAGGGG TGGAGCCCGC CGGATCCGCA GACCGG6GGG GCGAAAACGA 900 

CATCAACACC GCACGGGATC GATCTGCGGA GGGGGGTGC6 GGAATACC6A ACCGGTGTAG 960 

GAGCGCCAGC AGTTGTTTTT CCACCAGCGA AGCGTTTTCG GGTCATCGGN GGCNNTTAAG 1020 

T 1021 
(2) INFORMATION FOR SEQ ID NO:21:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 321 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:

CGTGCCGACG AACGGAAGAA CACAACCATG M6ATGGTGA AATCGATCGC CGCAGGTCTG 60 

ACCGCCGC6G CT6CAATC6G CGCCGCTGCG GCCGGTGTGA CTTC6ATCAT GGCTG6CGGN 120 

CCGGTCGTAT ACCAGATGCA GCCGGTCGTC TTCGGCGCGC CACTGCCGTT GGACCC6GNA 180 

TCCGCCCCTG ANGTCCCGAC CGCCGCCCAG TGGACCAGNC TGCTCAACAG NCTCGNCGAT 240 

CCCAACGTGT CGTTTGNGAA CAAGGGNAGT CTGGTCGAGG GNGGNATCGG NGGNANCGAG 300 

GGNGNGNATC GNCGANCACA A 321 
(2) INFORMATION FOR SEQ ID NO:22:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 373 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:

TCTTATCGGT TCCGGTTGGC GACGGGTnT GGGNGCGGGT GGHAACCCG CTCGGCCAGC 60 

C6ATCGACGG GCGCGGAGAC GTCGACTCCG ATACTCGGCG CGCGCTGGAG CTCCAGGCGC 120 

CCTCGGTGGT GNACCGGCAA GGCGTGAAGG AGCCGTTGNA 6ACCGGGATC AAGGCGAHG 180 

ACGCGATGAC CCCGATCGGC CGCGGGCAGC GCCAGCTGAT CATCGGGGAC CGCAAGACCG 240 

GCAAAAACCG CCGTCTGTGT CGGACACCAT CCTCAAACCA GCGGGAAGAA CTGGGAGTCC 300

GGTGGATCCC MGAAGCAGG TGCGCHGIG TATACGTTGG CCATCGGGCA AGAAGGGGAA 360

CTTACCATCG CCG 373
(2) INFORMATION FOR SEQ ID NO:23:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 352 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23:

GT6ACGCCGT GAT6GGATTC CTGGGCGGGG CCG6TCCGCT GGCGGTGGTG GATCAGCAAC 60 

TGGHACCCG 6GTGCCGCAA GGCTGGTCGT TTGCTCAGGC AGCCGCTGTG CCGGTGGTGT 120 

TCTTGACG6C CTGGTACGG6 nGGCCGATT TAGCCGAGAT CAAGGCGGGC GAATCGGTGC 180 

T6ATCCATGC CGGTACCGGC GGTGTGGGCA TGGCGGCTGT GCAGCTGGCT CGCCAGTGGG 240 

GCGT6GAGGT TTTCGTCACC GCCAGCCGTG GNAA6TGGGA CACGCTGCGC GCCATNGNGT 300 

TTGACGACGA NCCATATCGG NGATTCCCNC ACATNCGAAG TTCCGANGGA GA 352 
(2) INFORMATION FOR SEQ ID NO:24:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 726 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:

GAAATCC6CG TTCATTCCGT TCGACCAGCG GCTGGCGATA ATCGACGAAG TGATCAAGCC 60 

GCGGTTC6CG GCGCTCATGG GTCACA6CGA GTAATCAGCA AGHCTCIGG TATATCGCAC 120 

CTAGCGTCCA GTTGCnGCC AGATCGCTTT CGTACCGTCA TC6CATGTAC CGGTTCGCGT 180 

GCCGCACGCT CATGCTGGCG GCGTGCATCC TGGCCACG6G TGTGGCGGGT CTCGGGGTCG 240 

GCGCGCAGTC CGCAGCCCAA ACCGCGCCGG TGCCCGACTA CTACT6GTGC CCGGGGCAGC 300 

CTTTCGACCC CGCATGGGGG CCCAACTGGG ATCCCTACAC CTGCCATGAC GACHCCACC 360 

GCGACAGCGA CGGCCCCGAC CACAGCCGCG ACTACCCCGG ACCCATCCTC GAAGGTCCCG 420 

TGCTTGACGA TCCCGGTGCT GCGCCGCCGC CCCCGGCTGC CGGTGGCGGC GCATAGCGCT 480 

CGHGACCGG GCCGCATCAG CGAATAC6CG TATAAACCCG GGCGTGCCCC CGGCAAGCTA 540 

CGACCCCCGG CGGGGCAGAT TTACGCTCCC GTGCCGATGG ATCGCGCCGT CCGATGACA6 600 

AAAATAGGCG ACGGTTTTGG CAACCGCHG GAGGACGCTT GAAGGGAACC TGTCATGAAC 660 

GGCGACAGCG CCTCCACCAT CGACATCGAC AAGGnGTTA CCCGCACACC CGTTCGCCGG 720

ATCGTG 726

(2) INFORMATION FOR SEQ ID NO:25:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 580 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25:
CGCGACGACG ACGAACGTCG GGCCCACCAC CGCCTATGCG TTGATGCAGG CGACCGGGAT 60

GGTCGCCGAC CATATCCAAG CATGCTGGGT GCCCACTGAG CGACCTTTTG ACCAGCCGGG 120

CTGCCCGATG GCGGCCCGGT GAAGTCAHG CGCCGGGGCT TGTGCACCTG ATGAACCCGA 180

ATAGGGAACA ATAGGGGGGT GAUrGGCAG TTCAATGTCG GGTATGGCTG GAAATCCAAT 240

GGCGGGGCAT GCTCGGCGCC GACCAGGCTC GCGCAGGCGG GCCAGCCCGA ATCTGGAGGG 300

AGCACTCAAT GGCGGCGATG AAGCCCCGGA CCGGCGACGG TCCTTTGGAA GCAACTAAGG 360

AGGGGCGCGG CAHGTGATG CGAGTACCAC TTGAGGGTGG CGGTCGCCTG GTCGTCGAGC 420

TGACACCCGA CGAAGCCGCC GCACTGGGTG ACGAACTCAA AGGCGTTACT AGCTAAGACG 480

AGCCCAACGG CGAATGGTCG GCGHACGCG CACACCHCC GGTAGATGTC CAGTGTCTGC 540

TCGGCGATGT ATGCCCAGGA GAACTCHGG ATACAGCGCT 580
(2) INFORMATION FOR SEQ ID NO:26:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 160 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26:

AACGGAGGCG CCGGGGGTIT TGGCGGGGCC GGGGCGGTCG GCGGCAACGG CGGGGCCGGC 60

GGTACCGCCG GGTTGnCGG TGTCGGCGGG GCCGGTGGGG CCGGAGGCAA CGGCATCGCC 120

GGTGTCACGG GTACGTCGGC CAGCACACCG GGTGGATCCG 160
(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 272 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:27:

GACACCGATA CGATGGTGAT GTACGCCAAC GTTGTCGACA CGCTCGAGGC GTTCACGATC
CAGCGCACAC CCGACGGCGT GACCATCGGC GATGCGGCCC CGTTCGCGGA GGCGGCTGCC
AAGGCGATGG GAATCGACAA GCTGCGGGTA ATTCATACCG GAATGGACCC CGTCGTCGCT
GAACGCGAAC AGTGGGACGA CGGCAACAAC ACGTTGGCGT TGGCGCCCGG TGTCGTTGTC
GCCTACGAGC GCAACGTACA GACCAACGCC CG
(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 317 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28:

GCAGCCGGTG GTTCTCGGAC TATCTGCGCA CGGTGACGCA GCGCGACGTG CGCGAGCTGA 60

AGCGGATCGA GCAGACGGAT CGCCTGCCGC GGTTCATGCG CTACCTGGCC GCTATCACCG 120

CGCAGGAGCT GAACGTGGCC GAAGCGGCGC GGGTCATCGG GGTCGACGCG GGGACGATCC 180

GTTCGGATCT GGCGTGGTTC GAGACGGTCT ATCTGGTACA TCGCCTGCCC GCCTGGTCGC 240

GGAATCTGAC CGCGAAGATC AAGAAGCGGT CAAAGATCCA CGTCGTCGAC AGTGGCTTCG 300
CGGCCTGG7T GCGCGGG

(2) INFORMATION FOR SEQ ID NO:29:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 182 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:29:
GATCGTGGAG CTGTCGATGA ACAGCGTTGC CGGACGCGCG GCGGCCAGCA CGTCGGTGTA 60
GCAGCGCCGG ACCACCTCGC CGGTGGGCAG CATGGTGATG ACCACGTCGG CCTCGGCCAC 120
CGCTTCGGGC GCGCTACGAA ACACCGCGAC ACCGTGCGCG GCGGCGCCGG ACGCCGCCGT 180
GG 

(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 308 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:30:

GATCGCGAAG TTTGGTGAGC AGGTGGTCGA CGCGAAAGTC TGGGCGCCTG CGAAGCGGGT 60

CGGCGTTCAC GAGGCGAAGA CACGCCTGTC CGAGCTGCTG CGGCTCGTCT ACGGCGGGCA 120

GAGGTTGAGA TTGCCCGCCG CGGCGAGCCG GTAGCAAAGC TTGTGCCGCT GCATCCTCAT 180

GAGACTCGGC GGTTAGGCAT TGACCATGGC GTGTACCGCG TGCCCGACGA TTTGGACGCT 240

CCGTTGTCAG ACGACGTGCT CGAACGCTTT CACCGGTGAA GCGCTACCTC ATCGACACCC 300

ACGTTTGG 308

(2) INFORMATION FOR SEQ ID NO:31:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 267 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:31:

CCGACGACGA GCAACTCACG TGGATGATGG TCGGCAGCGG CATTGAGGAC GGAGAGAATC 60

CGGCCGAAGC TGCCGCGCGG CAAGTGCTCA TAGTGACCGG CCGTAGAGGG CTCCCCCGAT 120

GGCACCGGAC TATTCTGGTG TGCCGCTGGC CGGTAAGAGC GGGTAAAAGA ATGTGAGGGG 180

ACACGATGAG CAATCACACC TACCGAGTGA TCGAGATCGT CGGGACCTCG CCCGACGGCG 240

TCGACGCGGC AATCCAGGGC GGTCTGG 267



(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 189 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:32:

CTCGTGCCGA AAGAATGTGA GGGGACACGA TGAGCAATCA CACCTACCGA GTGATCGAGA 60

TCGTCGGGAC CTCGCCCGAC GGCGTCGACG CGGCAATCCA GGGCGGTCTG GCCCGAGCTG 120 

CGCAGACCAT GCGCGCGCTG GACTGGTTCG AAGTACAGTC AATTCGAGGC CACCTGGTCG 180 

ACGGAGCGG 189 
(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 851 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:33:

CTGCAGGGTG GCGTGGATGA GCGTCACCGC GGGGCAGGCC GAGCTGACCG CCGCCCAGGT 60

CCGGGTTGCT GCGGCGGCCT ACGAGACGGC GTATGGGCTG ACGGTGCCCC CGCCGGTGAT 120

CGCCGAGAAC CGTGCTGAAC TGATGATTCT GATAGCGACC AACCTCTTGG GGCAAAACAC 180

CCCGGCGATC GCGGTCAACG AGGCCGAATA CGGCGAGATG TGGGCCCAAG ACGCCGCCGC 240

GATGTTTGGC TACGCCGCGG CGACGGCGAC GGCGACGGCG ACGTTGCTGC CGTTCGAGGA 300

GGCGCCGGAG ATGACCAGCG CGGGTGGGCT CCTCGAGCAG GCCGCCGCGG TCGAGGAGGC 360

CTCCGACACC GCCGCGGCGA ACCAGTTGAT GAACAATGTG CCCCAGGCGC TGAAACAGTT 420

GGCCCAGCCC ACGCAGGGGA CCACGCCTTC TTCCAAGCTG GGTGGCCTGT GGAAGACGGT 480

CTCGCCGCAT CGGTCGCCGA TCAGCAACAT GGTGTCGATG GCCAACAACC ACATGTCGAT 540

GACCAACTCG GGTGTGTCGA TGACCAACAC CTTGAGCTCG ATGTTGAAGG GCTTTGCTCC 600

GGCGGCGGCC GCCCAGGCCG TGCAAACCGC GGCGCAAAAC GGGGTCCGGG CGATGAGCTC 660

GCTGGGCAGC TCGCTGGGTT CTTCGGGTCT GGGCGGTGGG GTGGCCGCCA ACTTGGGTCG 720

GGCGGCCTCG GTACGGTATG GTCACCGGGA TGGCGGAAAA TATGCANAGT CTGGTCGGCG 780

GAACGGTGGT CCGGCGTAAG GTTTACCCCC GTTTTCTGGA TGCGGTGAAC TTCGTCAACG 840

GAAACAGTTA C 851 
(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 254 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:34:

GATCGATCGG GCGGAAATTT GGACCAGATT CGCCTCCGGC GATAACCCAA TCAATCGAAC
CTAGATTTAT TCCGTCCAGG GGCCCGAGTA ATGGCTCGCA GGAGAGGAAC CTTACTGCTG
CGGGCACCTG TCGTAGGTCC TCGATACGGC GGAAGGCGTC GACATTTTCC ACCGACACCC
CCATCCAAAC GTTCGAGGGC CACTCCAGCT TGTGAGCGAG GCGACGCAGT CGCAGGCTGC
GCTTGGTCAA GATC

(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 408 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:35:

CGGCACGAGG ATCCTGACCG AAGCGGCCGC CGCCAAGGCG AAGTCGCTGT TGGACCAGGA
GGGACGGGAC GATCTGGC3C TGCGGATCGC GGTTCAGCCG GGGGGGTGCG CTGGATTGCG
CTATAACCTT TTCTTCGACG ACCGGACGCT GGATGGTGAC CAAACCGCGG AGTTCGGTGG
TGTCAGGTTG ATCGTGGACC GGATGAGCGC GCCGTATGTG GAAGGCGCGT CGATCGATTT
CGTCGACACT ATTGAGAAGC AAGGNTTCAC CATCGACAAT CCCAACGCCA CCGGCTCCTG
CGCGTGCGGG GATTCGTTCA ACTGATAAAA CGCTAGTACG ACCCCGCGGT GCGCAACACG
TACGAGCACA CCAAGACCTG ACCGCGCTGG AAAAGCAACT GAGCGATG

(2) INFORMATION FOR SEQ ID NO:36:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 181 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:36:

GCGGTGTCGG CGGATCCGGC GGGTGGTTGA ACGGCAACGG CGGGGCCGGC GGGGCCGGCG
GGACCGGCGC TAACGGTGGT GCCGGCGGCA ACGCCTGGTT GTTCGGGGCC GGCGGGTCCG
GCGGNGCCGG CACCAATGGT GGNGTCGGCG GGTCCGGCGG ATTTGTCTAC GGCAACGGCG

(2) INFORMATION FOR SEQ ID NO:37:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 290 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:37: 

GCGGTGTCGG CGGATCCGGC GGGTGGTTGA ACGGCAACGG CGGTGTCGGC GGCCGGGGCG 60 

GCGACGGCGT CTTTGCCGGT GCCGGCGGCC AGGGCGGCCT CGGTGGGCAG GGCGGCAATG 120

GCGGCGGCTC CACCGGCGGC AACGGCGGTC TTGGCGGCGC GGGCGGTGGC GGAGGCAACG 180

CCCCGGACGG CGGCTTCGGT GGCAACGGCG GTAAGGGTGG CCAGGGCGGN ATTGGCGGCG
GCACTCAGAG CGCGACCGGC CTCGGNGGTG ACGGCGGTGA CGGCGGTGAC

(2) INFORMATION FOR SEQ ID NO:38:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:38:
GATCCAGTGG CATGGNGGGT GTCAGTGGAA GCAT

(2) INFORMATION FOR SEQ ID NO:39:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 155 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:39:

GATCGCTGCT CGTCCCCCCC TTGCCGCCGA CGCCACCGGT CCCACCGTTA CCGAACAAGC
TGGCGTGGTC GCCAGCACCC CCGGCACCGC CGACGCCGGA GTCGAACAAT GGCACCGTCG
TATCCCCACC ATTGCCGCCG GNCCCACCGG CACCG

(2) INFORMATION FOR SEQ ID NO:40:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 53 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:40:
ATGGCGTTCA CGGGGCGCCG GGGACCGGGC AGCCCGGNGG GGCCGGGGGG TGG 53

(2) INFORMATION FOR SEQ ID NO:41:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 132 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:41:
GATCCACCGC GGGTGCAGAC GGTGCCCGCG GCGCCACCCC GACCAGCGGC GGCAACGGCG 60
GCACCGGCGG CAACGGCGCG AACGCCACCG TCGTCGGNGG GGCCGGCGGG GCCGGCGGCA 120
AGGGCGGCAA CG 132 
(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 132 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:
GATCGGCGGC CGGNACGGNC GGGGACGGCG GCAAGGGCGG NAACGGGGGC GCCGNAGCCA
CCNGCCAAGA ATCCTCCGNG TCCNCCAATG GCGCGAATGG CGGACAGGGC GGCAACGGCG
GCANCGGCGG CA

(2) INFORMATION FOR SEQ ID NO:43:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 702 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:43:

CGGCACGAGG ATCGGTACCC CGCGGCATCG GCAGCTGCCG ATTCGCCGGG TTTCCCCACC
CGAGGAAAGC CGCTACCAGA TGGCGCTGCC GAAGTAGGGC GATCCGTTCG CGATGCCGGC
ATGAACGGGC GGCATCAAAT TAGTGCAGGA ACCTTTCAGT TTAGCGACGA TAATGGCTAT
AGCACTAAGG AGGATGATCC GATATGACGC AGTCGCAGAC CGTGACGGTG GATCAGCAAG 240
AGATTTTGAA CAGGGCCAAC GAGGTGGAGG CCCCGATGGC GGACCCACCG ACTGATGTCC 300
CCATCACACC GTGCGAACTC ACGGNGGNTA AAAACGCCGC CCAACAGNTG GTNTTGTCCG 360
CCGACAACAT GCGGGAATAC CTGGCGGCCG GTGCCAAAGA GCGGCAGCGT CTGGCGACCT 420
CGCTGCGCAA CGCGGCCAAG GNGTATGGCG AGGTTGATGA GGAGGCTGCG ACCGCGCTGG 480
ACAACGACGG CGAAGGAACT GTGCAGGCAG AATCGGCCGG GGCCGTCGGA GGGGACAGTT 540

CGGCCGAACT AACCGATACG CCGAGGGTGG CCACGGCCGG TGAACCCAAC TTCATGGATC 600

TCAAAGAAGC GGCAAGGAAG CTCGAAACGG GCGACCAAGG CGCATCGCTC GCGCACTGNG 660

GGGATGGGTG GAACACTTNC ACCCTGACGC TGCAAGGCGA CG 702

(2) INFORMATION FOR SEQ ID NO:44:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 298 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:44:

GAAGCCGCAG CGCTGTCGGG CGACGTGGCG GTCAAAGCGG CATCGCTCGG TGGCGGTGGA 60

GGCGGCGGGG TGCCGTCGGC GCCGTTGGGA TCCGCGATCG GGGGCGCCGA ATCGGTGCGG 120

CCCGCTGGCG CTGGTGACAT TGCCGGCTTA GGCCAGGGAA GGGCCGGCGG CGGCGCCGCG 180

CTGGGCGGCG GTGGCATGGG AATGCCGATG GGTGCCGCGC ATCAGGGACA AGGGGGCGCC 240

AAGTCCAAGG GTTCTCAGCA GGAAGACGAG GCGCTCTACA CCGAGGATCC TCGTGCCG 298

(2) INFORMATION FOR SEQ ID NO:45:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1058 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:45:

CGGCACGAGG ATCGAATCGC GTCGCCGGGA GCACAGCGTC GCACTGCACG AGTGGAGGAG
CCATGACCTA CTCGCCGGGT AACCCCGGAT ACCCGCAAGC GCAGCCCGCA GGCTCCTACG
GAGGCGTCAC ACCCTCGTTC GCCCACGCCG ATGAGGGTGC GAGCAAGCTA CCGATGTACC
TGAACATCGC GGTGGCAGTG CTCGGTCTGG CTGCGTACTT CGCCAGCTTC GGCCCAATGT
TCACCCTCAG TACCGAACTC GGGGGGGGTG ATGGCGCAGT GTCCGGTGAC ACTGGGCTGC
CGGTCGGGGT GGCTCTGCTG GCTGCGCTGC 7TGCCGGGGT GGTTCTGGTG CCTAAGGCCA
AGAGCCATGT GACGGTAGTT GCGGTGCTCG GGGTACTCGG CGTATTTCTG ATGGTCTCGG
CGACGTTTAA CAAGCCCAGC GCCTATTCGA CCGGTTGGGC ATTGTGGGTT GTGTTGGCTT
TCATCGTGTT CCAGGCGGTT GCGGCAGTCC TGGCGCTCTT GGTGGAGACC GGCGCTATCA
CCGCGCCGGC GCCGCGGCCC AAGTTCGACC CGTATGGACA GTACGGGCGG TACGGGCAGT
ACGGGCAGTA CGGGGTGCAG CCGGGTGGGT ACTACGGTCA GCAGGGTGCT CAGCAGGCCG
CGGGACTGCA GTCGCCCGGC CCGCAGCAGT CTCCGCAGCC TCCCGGATAT GGGTCGCAGT
ACGGCGGCTA TTCGTCCAGT CCGAGCCAAT CGGGCAGTGG ATACACTGCT CAGCCCCCGG
CCCAGCCGCC GGCGCAGTCC GGGTCGCAAC AATCGCACCA GGGCCCATCC ACGCCACCTA
CCGGCnrCC GAGCTTCAGC CCACCACCAC CGGTCAGTGC CGGGACGGGG TCGCAGGCTG
GTTCGGCTCC AGTCAACTAT TCAAACCCCA GCGGGGGCGA GCAGTCGTCG TCCCCCGGGG
GGGCGCCGGT CTAACCGGGC GTTCCCGCGT CCGGTCGCGC GTGTGCGCGA AGAGTGAACA 1020
GGGTGTCAGC AAGCGCGGAC GATCCTCGTG CCGAATTC 1058

(2) INFORMATION FOR SEQ ID NO:46:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 327 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:46:

CGGCACGAGA GACCGATGCC GCTACCCTCG CGCAGGAGGC AGGTAATTTC GAGCGGATCT 60

CCGGCGACCT GAAAACCCAG ATCGACCAGG TGGAGTCGAC GGCAGGTTCG TTGCAGGGCC 120

AGTGGCGCGG CGCGGCGGGG ACGGCCGCCC AGGCCGCGGT GGTGCGCTTC CAAGAAGCAG 180

CCAATAAGCA GAAGCAGGAA CTCGACGAGA TCTCGACGAA TATTCGTCAG GCCGGCGTCC 240

AATACTCGAG GGCCGACGAG GAGCAGCAGC AGGCGCTGTC CTCGCAAATG GGCTTCTGAC 300

CCGCTAATAC GAAAAGAAAC GGAGCAA 327

(2) INFORMATION FOR SEQ ID NO:47:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 170 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:47:
CGGTCGCGAT GATGGCGTTG TCGAACGTGA CCGATTCTGT ACCGCCGTCG TTGAGATCAA 60 
CCAACAACGT GTTGGCGTCG GCAAATGTGC CGNACCCGTG GATCTCGGTG ATCTTGTTCT 120 
TCTTCATCAG GAAGTGCACA CCGGCCACCC TGCCCTCGGN TACCTTTCGG 170 
(2) INFORMATION FOR SEQ ID NO:48:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 127 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:48:
GATCCGGCGG CACGGGGGGT GCCGGCGGCA GCACCGCTGG CGCTGGCGGC AACGGCGGGG 60 
CCGGGGGTGG CGGCGGAACC GGTGGGTTGC TCTTCGGCAA CGGCGGTGCC GGCGGGCACG 120 
GGGCCGT 127 
(2) INFORMATION FOR SEQ ID NO:49:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 81 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:49:
CGGCGGCAAG GGCGGCACCG CCGGCAACGG GAGCGGCGCG GCCGGCGGCA ACGGCGGCAA 60
CGGCGGCTCC GGCCTCAACG G 81

(2) INFORMATION FOR SEQ ID NO: 50:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 149 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:50:
GATCAGGGCT GGCCGGCTCC GGCCAGAAGG GCGGTAACGG AGGAGCTGCC GGATTGTTTG 60
GCAACGGCGG GGCCGGNGGT GCCGGCGCGT CCAACCAAGC CGGTAACGGC GGNGCCGGCG 120
GAAACGGTGG TGCCGGTGGG CTGATCTGG 149

(2) INFORMATION FOR SEQ ID NO:51:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 355 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:51:
CGGCACGAGA TCACACCTAC CGAGTGATCG AGATCGTCGG GACCTCGCCC GACGGTGTCG 60

ACGCGGNAAT CCAGGGCGGT CTGGCCCGAG CTGCGCAGAC CATGCGCGCG CTGGACTGGT 120

TCGAAGTACA GTCAATTCGA GGCCACCTGG TCGACGGAGC GGTCGCGCAC TTCCAGGTGA 180

CTATGAAAGT CGGCTTCCGC CTGGAGGATT CCTGAACCTT CAAGCGCGGC CGATAACTGA 240

GGTGCATCAT TAAGCGACTT TTCCAGAACA TCCTGACGCG CTCGAAACGC GGTTCAGCCG 300

ACGGTGGCTC CGCCGAGGCG CTGCCTCCAA AATCCCTGCG ACAATTCGTC GGCGG 355
(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 999 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:52:

ATGCATCACC ATCACCATCA CATGCATCAG GTGGACCCCA ACTTGACACG TCGCAAGGGA 60

CGATTGGCGG CACTGGCTAT CGCGGCGATG GCCAGCGCCA GCCTGGTGAC CGTTGCGGTG 120

CCCGCGACCG CCAACGCCGA TCCGGAGCCA GCGCCCCCGG TACCCACAAC GGCCGCCTCG 180

CCGCCGTCGA CCGCTGCAGC GCCACCCGCA CCGGCGACAC CTGTTGCCCC CCCACCACCG 240

GCCGCCGCCA ACACGCCGAA TGCCCAGCCG GGCGATCCCA ACGCAGCACC TCCGCCGGCC 300

GACCCGAACG CACCGCCGCC ACCTGTCATT GCCCCAAACG CACCCCAACC TGTCCGGATC 360

GACAACCCGG TTGGAGGATT CAGCTTCGCG CTGCCTGCTG GCTGGGTGGA GTCTGACGCC 420

GCCCACTTCG ACTACGGTTC AGGACTCCTC AGGAAAACCA CCGGGGACCC GCCATTTCCC 480

GGACAGCCGC CGCCGGTGGC CAATGACACC CGTATCGTGC TCGGCCGGCT AGACCAAAAG 540

CTTTACGCCA GCGCCGAAGC CACCGACTCC AAGGCCGCGG CCCGGTTGGG CTCGGACATG 600

GGTGAGTTCT ATATGCCCTA CCCGGGCACC CGGATCAACC AGGAAACCGT CTCGCTCGAC 660

GCCAACGGGG TGTCTGGAAG CGCGTCGTAT TACGAAGTCA AGTTCAGCGA TCCGAGTAAG 720

CCGAACGGCC AGATCTGGAC GGGCGTAATC GGCTCGCCCG CGGCGAACGC ACCGGACGCC 780

GGGCCCCCTC AGCGCTGGTT TGTGGTATGG CTCGGGACCG CCAACAACCC GGTGGACAAG 840

GGCGCGGCCA AGGCGCTGGC CGAATCGATC CGGCCTTTGG TCGCCCCGCC GCCGGCGCCG 900

GCACCGGCTC CTGCAGAGCC CGCTCCGGCG CCGGCGCCGG CCGGGGAAGT CGCTCCTACC 960

CCGACGACAC CGACACCGCA GCGGACCTTA CCGGCCTGA 999
(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 332 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:53:

Met His His His His His His Met His Gln Val Asp Pro Asn Leu Thr
1 5 10 15

Arg Arg Lys Gly Arg Leu Ala Ala Leu Ala Ile Ala Ala Met Ala Ser
20 25 30

Ala Ser Leu Val Thr Val Ala Val Pro Ala Thr Ala Asn Ala Asp Pro
35 40 45

Glu Pro Ala Pro Pro Val Pro Thr Thr Ala Ala Ser Pro Pro Ser Thr
50 55 60

Ala Ala Ala Pro Pro Ala Pro Ala Thr Pro Val Ala Pro Pro Pro Pro
65 70 75 80

Ala Ala Ala Asn Thr Pro Asn Ala Gln Pro Gly Asp Pro Asn Ala Ala
85 90 95

Pro Pro Pro Ala Asp Pro Asn Ala Pro Pro Pro Pro Val Ile Ala Pro
100 105 110

Asn Ala Pro Gln Pro Val Arg Ile Asp Asn Pro Val Gly Gly Phe Ser
115 120 125

Phe Ala Leu Pro Ala Gly Trp Val Glu Ser Asp Ala Ala His Phe Asp
130 135 140

Tyr Gly Ser Ala Leu Leu Ser Lys Thr Thr Gly Asp Pro Pro Phe Pro
145 150 155 160

Gly Gln Pro Pro Pro Val Ala Asn Asp Thr Arg Ile Val Leu Gly Arg
165 170 175

Leu Asp Gln Lys Leu Tyr Ala Ser Ala Glu Ala Thr Asp Ser Lys Ala
180 185 190

Ala Ala Arg Leu Gly Ser Asp Met Gly Glu Phe Tyr Met Pro Tyr Pro
195 200 205

Gly Thr Arg Ile Asn Gln Glu Thr Val Ser Leu Asp Ala Asn Gly Val
210 215 220

Ser Gly Ser Ala Ser Tyr Tyr Glu Val Lys Phe Ser Asp Pro Ser Lys
225 230 235 240

Pro Asn Gly Gln Ile Trp Thr Gly Val Ile Gly Ser Pro Ala Ala Asn
245 250 255

Ala Pro Asp Ala Gly Pro Pro Gln Arg Trp Phe Val Val Trp Leu Gly
260 265 270

Thr Ala Asn Asn Pro Val Asp Lys Gly Ala Ala Lys Ala Leu Ala Glu
275 280 285

Ser Ile Arg Pro Leu Val Ala Pro Pro Pro Ala Pro Ala Pro Ala Pro
290 295 300

Ala Glu Pro Ala Pro Ala Pro Ala Pro Ala Gly Glu Val Ala Pro Thr
305 310 315 320

Pro Thr Thr Pro Thr Pro Gln Arg Thr Leu Pro Ala
325 330

(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:54:

Asp Pro Val Asp Ala Val Ile Asn Thr Thr Xaa Asn Tyr Gly Gln Val
1 5 10 15

Val Ala Ala Leu
20

(2) INFORMATION FOR SEQ ID NO: 55:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:55:

Ala Val Glu Ser Gly Met Leu Ala Leu Gly Thr Pro Ala Pro Ser
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:56:

Ala Ala Met Lys Pro Arg Thr Gly Asp Gly Pro Leu Glu Ala Ala Lys 
1 5 10 15 

Glu Gly Arg 



(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:57:

Tyr Tyr Trp Cys Pro Gly Gln Pro Phe Asp Pro Ala Trp Gly Pro
1 5 10 15 

(2) INFORMATION FOR SEQ ID NO:58: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:58:

Asp Ile Gly Ser Glu Ser Thr Glu Asp Gln Gln Xaa Ala Val
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:59:

Ala Glu Glu Ser Ile Ser Thr Xaa Glu Xaa Ile Val Pro
1 5 10

(2) INFORMATION FOR SEQ ID NO: 60:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:60:

Asp Pro Glu Pro Ala Pro Pro Val Pro Thr Ala Ala Ala Ala Pro Pro 
1 5 10 15 

Ala 



(2) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:61:

Ala Pro Lys Thr Tyr Xaa Glu Glu Leu Lys Gly Thr Asp Thr Gly 
1 5 10 15 

(2) INFORMATION FOR SEQ ID NO: 62: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 30 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:62:

Asp Pro Ala Ser Ala Pro Asp Val Pro Thr Ala Ala Gln Gln Thr Ser
1 5 10 15

Leu Leu Asn Asn Leu Ala Asp Pro Asp Val Ser Phe Ala Asp 
20 25 30 

(2) INFORMATION FOR SEQ ID NO: 63: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:63:

Gly Cys Gly Asp Arg Ser Gly Gly Asn Leu Asp Gln Ile Arg Leu Arg
1 5 10 15 

Arg Asp Arg Ser Gly Gly Asn Leu 
20 

(2) INFORMATION FOR SEQ ID NO:64:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 187 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:64:

Thr Gly Ser Leu Asn Gln Thr His Asn Arg Arg Ala Asn Glu Arg Lys
1 5 10 15

Asn Thr Thr Met Lys Met Val Lys Ser Ile Ala Ala Gly Leu Thr Ala
20 25 30

Ala Ala Ala Ile Gly Ala Ala Ala Ala Gly Val Thr Ser Ile Met Ala
35 40 45

Gly Gly Pro Val Val Tyr Gln Met Gln Pro Val Val Phe Gly Ala Pro
50 55 60

Leu Pro Leu Asp Pro Ala Ser Ala Pro Asp Val Pro Thr Ala Ala Gln
65 70 75 80

Leu Thr Ser Leu Leu Asn Ser Leu Ala Asp Pro Asn Val Ser Phe Ala
85 90 95

Asn Lys Gly Ser Leu Val Glu Gly Gly Ile Gly Gly Thr Glu Ala Arg
100 105 110

Ile Ala Asp His Lys Leu Lys Lys Ala Ala Glu His Gly Asp Leu Pro
115 120 125

Leu Ser Phe Ser Val Thr Asn Ile Gln Pro Ala Ala Ala Gly Ser Ala
130 135 140

Thr Ala Asp Val Ser Val Ser Gly Pro Lys Leu Ser Ser Pro Val Thr
145 150 155 160

Gln Asn Val Thr Phe Val Asn Gln Gly Gly Trp Met Leu Ser Arg Ala
165 170 175

Ser Ala Met Glu Leu Leu Gln Ala Ala Gly Xaa
180 185

(2) INFORMATION FOR SEQ ID NO: 65: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 148 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:65: 

Asp Glu Val Thr Val Glu Thr Thr Ser Val Phe Arg Ala Asp Phe Leu
1 5 10 15

Ser Glu Leu Asp Ala Pro Ala Gln Ala Gly Thr Glu Ser Ala Val Ser
20 25 30

Gly Val Glu Gly Leu Pro Pro Gly Ser Ala Leu Leu Val Val Lys Arg
35 40 45

Gly Pro Asn Ala Gly Ser Arg Phe Leu Leu Asp Gln Ala Ile Thr Ser
50 55 60

Ala Gly Arg His Pro Asp Ser Asp Ile Phe Leu Asp Asp Val Thr Val
65 70 75 80

Ser Arg Arg His Ala Glu Phe Arg Leu Glu Asn Asn Glu Phe Asn Val
85 90 95

Val Asp Val Gly Ser Leu Asn Gly Thr Tyr Val Asn Arg Glu Pro Val
100 105 110

Asp Ser Ala Val Leu Ala Asn Gly Asp Glu Val Gln Ile Gly Lys Leu
115 120 125

Arg Leu Val Phe Leu Thr Gly Pro Lys Gln Gly Glu Asp Asp Gly Ser
130 135 140

Thr Gly Gly Pro
145

(2) INFORMATION FOR SEQ ID NO: 66:




(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 230 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:66:

Thr Ser Asn Arg Pro Ala Arg Arg Gly Arg Arg Ala Pro Arg Asp Thr
1 5 10 15

Gly Pro Asp Arg Ser Ala Ser Leu Ser Leu Val Arg His Arg Arg Gln
20 25 30

Gln Arg Asp Ala Leu Cys Leu Ser Ser Thr Gln Ile Ser Arg Gln Ser
35 40 45

Asn Leu Pro Pro Ala Ala Gly Gly Ala Ala Asn Tyr Ser Arg Arg Asn
50 55 60

Phe Asp Val Arg Ile Lys Ile Phe Met Leu Val Thr Ala Val Val Leu
65 70 75 80

Leu Cys Cys Ser Gly Val Ala Thr Ala Ala Pro Lys Thr Tyr Cys Glu
85 90 95

Glu Leu Lys Gly Thr Asp Thr Gly Gln Ala Cys Gln Ile Gln Met Ser
100 105 110

Asp Pro Ala Tyr Asn Ile Asn Ile Ser Leu Pro Ser Tyr Tyr Pro Asp
115 120 125

Gln Lys Ser Leu Glu Asn Tyr Ile Ala Gln Thr Arg Asp Lys Phe Leu
130 135 140

Ser Ala Ala Thr Ser Ser Thr Pro Arg Glu Ala Pro Tyr Glu Leu Asn
145 150 155 160

Ile Thr Ser Ala Thr Tyr Gln Ser Ala Ile Pro Pro Arg Gly Thr Gln
165 170 175

Ala Val Val Leu Xaa Val Tyr His Asn Ala Gly Gly Thr His Pro Thr
180 185 190

Thr Thr Tyr Lys Ala Phe Asp Trp Asp Gln Ala Tyr Arg Lys Pro Ile
195 200 205

Thr Tyr Asp Thr Leu Trp Gln Ala Asp Thr Asp Pro Leu Pro Val Val
210 215 220

Phe Pro Ile Val Ala Arg
225 230

(2) INFORMATION FOR SEQ ID NO:67:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 132 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:67:

Thr Ala Ala Ser Asp Asn Phe Gln Leu Ser Gln Gly Gly Gln Gly Phe
1 5 10 15

Ala Ile Pro Ile Gly Gln Ala Met Ala Ile Ala Gly Gln Ile Arg Ser
20 25 30

Gly Gly Gly Ser Pro Thr Val His Ile Gly Pro Thr Ala Phe Leu Gly
35 40 45

Leu Gly Val Val Asp Asn Asn Gly Asn Gly Ala Arg Val Gln Arg Val
50 55 60

Val Gly Ser Ala Pro Ala Ala Ser Leu Gly Ile Ser Thr Gly Asp Val
65 70 75 80

Ile Thr Ala Val Asp Gly Ala Pro Ile Asn Ser Ala Thr Ala Met Ala
85 90 95

Asp Ala Leu Asn Gly His His Pro Gly Asp Val Ile Ser Val Asn Trp
100 105 110

Gln Thr Lys Ser Gly Gly Thr Arg Thr Gly Asn Val Thr Leu Ala Glu
115 120 125

Gly Pro Pro Ala
130

(2) INFORMATION FOR SEQ ID NO: 68:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 100 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:68:

Val Pro Leu Arg Ser Pro Ser Met Ser Pro Ser Lys Cys Leu Ala Ala
1 5 10 15

Ala Gln Arg Asn Pro Val Ile Arg Arg Arg Arg Leu Ser Asn Pro Pro
20 25 30

Pro Arg Lys Tyr Arg Ser Met Pro Ser Pro Ala Thr Ala Ser Ala Gly
35 40 45

Met Ala Arg Val Arg Arg Arg Ala Ile Trp Arg Gly Pro Ala Thr Xaa
50 55 60

Ser Ala Gly Met Ala Arg Val Arg Arg Trp Xaa Val Met Pro Xaa Val
65 70 75 80

Ile Gln Ser Thr Xaa Ile Arg Xaa Xaa Gly Pro Phe Asp Asn Arg Gly
85 90 95

Ser Glu Arg Lys
100

(2) INFORMATION FOR SEQ ID NO:69:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 163 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69: 

Met Thr Asp Asp Ile Leu Leu Ile Asp Thr Asp Glu Arg Val Arg Thr
1 5 10 15

Leu Thr Leu Asn Arg Pro Gln Ser Arg Asn Ala Leu Ser Ala Ala Leu
20 25 30

Arg Asp Arg Phe Phe Ala Xaa Leu Xaa Asp Ala Glu Xaa Asp Asp Asp
35 40 45

Ile Asp Val Val Ile Leu Thr Gly Ala Asp Pro Val Phe Cys Ala Gly
50 55 60

Leu Asp Leu Lys Val Ala Gly Arg Ala Asp Arg Ala Ala Gly His Leu
65 70 75 80

Thr Ala Val Gly Gly His Asp Gln Ala Gly Asp Arg Arg Asp Gln Arg
85 90 95

Arg Arg Gly His Arg Arg Ala Arg Thr Gly Ala Val Leu Arg His Pro
100 105 110

Asp Arg Leu Arg Ala Arg Pro Leu Arg Arg His Pro Arg Pro Gly Gly
115 120 125

Ala Ala Ala His Leu Gly Thr Gln Cys Val Leu Ala Ala Lys Gly Arg
130 135 140

His Arg Xaa Gly Pro Val Asp Glu Pro Asp Arg Arg Leu Pro Val Arg
145 150 155 160

Asp Arg Arg

(2) INFORMATION FOR SEQ ID NO:70:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 344 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:70:

Met Lys Phe Val Asn His Ile Glu Pro Val Ala Pro Arg Arg Ala Gly
1 5 10 15

Gly Ala Val Ala Glu Val Tyr Ala Glu Ala Arg Arg Glu Phe Gly Arg
20 25 30

Leu Pro Glu Pro Leu Ala Met Leu Ser Pro Asp Glu Gly Leu Leu Thr
35 40 45

Ala Gly Trp Ala Thr Leu Arg Glu Thr Leu Leu Val Gly Gln Val Pro
50 55 60

Arg Gly Arg Lys Glu Ala Val Ala Ala Ala Val Ala Ala Ser Leu Arg
65 70 75 80

Cys Pro Trp Cys Val Asp Ala His Thr Thr Met Leu Tyr Ala Ala Gly
85 90 95

Gln Thr Asp Thr Ala Ala Ala Ile Leu Ala Gly Thr Ala Pro Ala Ala
100 105 110

Gly Asp Pro Asn Ala Pro Tyr Val Ala Trp Ala Ala Gly Thr Gly Thr
115 120 125

Pro Ala Gly Pro Pro Ala Pro Phe Gly Pro Asp Val Ala Ala Glu Tyr
130 135 140

Leu Gly Thr Ala Val Gln Phe His Phe Ile Ala Arg Leu Val Leu Val
145 150 155 160

Leu Leu Asp Glu Thr Phe Leu Pro Gly Gly Pro Arg Ala Gln Gln Leu
165 170 175

Met Arg Arg Ala Gly Gly Leu Val Phe Ala Arg Lys Val Arg Ala Glu
180 185 190

His Arg Pro Gly Arg Ser Thr Arg Arg Leu Glu Pro Arg Thr Leu Pro
195 200 205

Asp Asp Leu Ala Trp Ala Thr Pro Ser Glu Pro Ile Ala Thr Ala Phe
210 215 220

Ala Ala Leu Ser His His Leu Asp Thr Ala Pro His Leu Pro Pro Pro
225 230 235 240

Thr Arg Gln Val Val Arg Arg Val Val Gly Ser Trp His Gly Glu Pro
245 250 255

Met Pro Met Ser Ser Arg Trp Thr Asn Glu His Thr Ala Glu Leu Pro
260 265 270

Ala Asp Leu His Ala Pro Thr Arg Leu Ala Leu Leu Thr Gly Leu Ala
275 280 285

Pro His Gln Val Thr Asp Asp Asp Val Ala Ala Ala Arg Ser Leu Leu
290 295 300

Asp Thr Asp Ala Ala Leu Val Gly Ala Leu Ala Trp Ala Ala Phe Thr
305 310 315 320

Ala Ala Arg Arg Ile Gly Thr Trp Ile Gly Ala Ala Ala Glu Gly Gln
325 330 335

Val Ser Arg Gln Asn Pro Thr Gly
340

(2) INFORMATION FOR SEQ ID NO:71:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 485 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:71:

Asp Asp Pro Asp Met Pro Gly Thr Val Ala Lys Ala Val Ala Asp Ala
1 5 10 15

Leu Gly Arg Gly Ile Ala Pro Val Glu Asp Ile Gln Asp Cys Val Glu
20 25 30

Ala Arg Leu Gly Glu Ala Gly Leu Asp Asp Val Ala Arg Val Tyr Ile
35 40 45

Ile Tyr Arg Gln Arg Arg Ala Glu Leu Arg Thr Ala Lys Ala Leu Leu
50 55 60

Gly Val Arg Asp Glu Leu Lys Leu Ser Leu Ala Ala Val Thr Val Leu
65 70 75 80

Arg Glu Arg Tyr Leu Leu His Asp Glu Gln Gly Arg Pro Ala Glu Ser
85 90 95

Thr Gly Glu Leu Met Asp Arg Ser Ala Arg Cys Val Ala Ala Ala Glu
100 105 110

Asp Gln Tyr Glu Pro Gly Ser Ser Arg Arg Trp Ala Glu Arg Phe Ala
115 120 125

Thr Leu Leu Arg Asn Leu Glu Phe Leu Pro Asn Ser Pro Thr Leu Met
130 135 140

Asn Ser Gly Thr Asp Leu Gly Leu Leu Ala Gly Cys Phe Val Leu Pro
145 150 155 160

Ile Glu Asp Ser Leu Gln Ser Ile Phe Ala Thr Leu Gly Gln Ala Ala
165 170 175

Glu Leu G- Arg Ala Gly Gly Gly Thr Gly Tyr Ala Phe Ser His Leu
180 185 190

Arg Pro Ala Gly Asp Arg Val Ala Ser Thr Gly Gly Thr Ala Ser Gly
195 200 205

Pro Val Ser Phe Leu Arg Leu Tyr Asp Ser Ala Ala Gly Val Val Ser
210 215 220

Met Gly Gly Arg Arg Arg Gly Ala Cys Met Ala Val Leu Asp Val Ser
225 230 235 240

His Pro Asp Ile Cys Asp Phe Val Thr Ala Lys Ala Glu Ser Pro Ser
245 250 255

Glu Leu Pro His Phe Asn Leu Ser Val Gly Val Thr Asp Ala Phe Leu
260 265 270 




Arg Ala Val Glu Arg Asn Gly Leu His Arg Leu Val Asn Pro Arg Thr 
275 280 285 

Gly Lys"Ile Val Ala Arg Met Pro Ala Ala Glu Leu Phe Asp Ala He 
290 295 300 

Cys Lys Ala Ala His Ala Gly Gly Asp Pro Gly Leu Val Phe Leu Asp 
305 310 315 320 

Thr He Asn Arg Ala Asn Pro Val Pro Gly Arg Gly Arg He Glu Ala 
325 330 335 

Thr Asn Pro Cys Gly Glu Val Pro Leu Leu Pro Tyr Glu Ser Cys Asn 
340 345 350 

Leu Gly Ser He Asn Leu Ala Arg Met Leu Ala Asp Gly Arg Val Asp 
355 360 365 

Trp Asp Arg Leu Glu Glu Val Ala Gly Val Ala Val Arg Phe Leu Asp 
370 375 380 

Asp Val He Asp Val Ser Arg Tyr Pro Phe Pro Glu Leu Gly Glu Ala 
385 390 395 400 

Ala Arg Ala Thr Arg Lys Ile Gly Leu Gly Val Met Gly Leu Ala Glu
405 410 415 

Leu Leu Ala Ala Leu Gly Ile Pro Tyr Asp Ser Glu Glu Ala Val Arg
420 425 430 

Leu Ala Thr Arg Leu Met Arg Arg Ile Gln Gln Ala Ala His Thr Ala
435 440 445 



Ser Arg Arg Leu Ala Glu Glu Arg Gly Ala Phe Pro Ala Phe Thr Asp 
450 455 460 
Ser Arg Phe Ala Arg Ser Gly Pro Arg Arg Asn Ala Gln Val Thr Ser
465 470 475 480 

Val Ala Pro Thr Gly
485 

(2) INFORMATION FOR SEQ ID NO: 72: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 267 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:72:

Gly Val Ile Val Leu Asp Leu Glu Pro Arg Gly Pro Leu Pro Thr Glu
1 5 10 15 

Ile Tyr Trp Arg Arg Arg Gly Leu Ala Leu Gly Ile Ala Val Val Val
20 25 30 

Val Gly Ile Ala Val Ala Ile Val Ile Ala Phe Val Asp Ser Ser Ala
35 40 45 

Gly Ala Lys Pro Val Ser Ala Asp Lys Pro Ala Ser Ala Gln Ser His
50 55 60 

Pro Gly Ser Pro Ala Pro Gln Ala Pro Gln Pro Ala Gly Gln Thr Glu
65 70 75 80 



Gly Asn Ala Ala Ala Ala Pro Pro Gln Gly Gln Asn Pro Glu Thr Pro
85 90 95 
Thr Pro Thr Ala Ala Val Gln Pro Pro Pro Val Leu Lys Glu Gly Asp
100 105 110 

Asp Cys Pro Asp Ser Thr Leu Ala Val Lys Gly Leu Thr Asn Ala Pro 
115 120 125 

Gln Tyr Tyr Val Gly Asp Gln Pro Lys Phe Thr Met Val Val Thr Asn
130 135 140 



Ile Gly Leu Val Ser Cys Lys Arg Asp Val Gly Ala Ala Val Leu Ala
145 150 155 160 

Ala Tyr Val Tyr Ser Leu Asp Asn Lys Arg Leu Trp Ser Asn Leu Asp 
165 170 175 

Cys Ala Pro Ser Asn Glu Thr Leu Val Lys Thr Phe Ser Pro Gly Glu 
180 185 190 

Gln Val Thr Thr Ala Val Thr Trp Thr Gly Met Gly Ser Ala Pro Arg
195 200 205 

Cys Pro Leu Pro Arg Pro Ala Ile Gly Pro Gly Thr Tyr Asn Leu Val
210 215 220 

Val Gln Leu Gly Asn Leu Arg Ser Leu Pro Val Pro Phe Ile Leu Asn
225 230 235 240 

Gln Pro Pro Pro Pro Pro Gly Pro Val Pro Ala Pro Gly Pro Ala Gln
245 250 255 

Ala Pro Pro Pro Glu Ser Pro Ala Gln Gly Gly
260 265 

(2) INFORMATION FOR SEQ ID NO: 73: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 97 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:73:

Leu Ile Ser Thr Gly Lys Ala Ser His Ala Ser Leu Gly Val Gln Val
1 5 10 15

Thr Asn Asp Lys Asp Thr Pro Gly Ala Lys Ile Val Glu Val Val Ala
20 25 30 

Gly Gly Ala Ala Ala Asn Ala Gly Val Pro Lys Gly Val Val Val Thr 
35 40 45 

Lys Val Asp Asp Arg Pro Ile Asn Ser Ala Asp Ala Leu Val Ala Ala
50 55 60 

Val Arg Ser Lys Ala Pro Gly Ala Thr Val Ala Leu Thr Phe Gln Asp
65 70 75 80 

Pro Ser Gly Gly Ser Arg Thr Val Gln Val Thr Leu Gly Lys Ala Glu
85 90 95 

Gln



(2) INFORMATION FOR SEQ ID NO: 74: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 364 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:74:

Gly Ala Ala Val Ser Leu Leu Ala Ala Gly Thr Leu Val Leu Thr Ala 
1 5 10 15

Cys Gly Gly Gly Thr Asn Ser Ser Ser Ser Gly Ala Gly Gly Thr Ser 
20 25 30 

Gly Ser Val His Cys Gly Gly Lys Lys Glu Leu His Ser Ser Gly Ser 
35 40 45 

Thr Ala Gln Glu Asn Ala Met Glu Gln Phe Val Tyr Ala Tyr Val Arg
50 55 60 

Ser Cys Pro Gly Tyr Thr Leu Asp Tyr Asn Ala Asn Gly Ser Gly Ala 
65 70 75 80 

Gly Val Thr Gln Phe Leu Asn Asn Glu Thr Asp Phe Ala Gly Ser Asp
85 90 95

Val Pro Leu Asn Pro Ser Thr Gly Gln Pro Asp Arg Ser Ala Glu Arg
100 105 110

Cys Gly Ser Pro Ala Trp Asp Leu Pro Thr Val Phe Gly Pro Ile Ala
115 120 125 

Ile Thr Tyr Asn Ile Lys Gly Val Ser Thr Leu Asn Leu Asp Gly Pro
130 135 140 

Thr Thr Ala Lys Ile Phe Asn Gly Thr Ile Thr Val Trp Asn Asp Pro
145 150 155 160



Gln Ile Gln Ala Leu Asn Ser Gly Thr Asp Leu Pro Pro Thr Pro Ile
165 170 175 
Ser Val Ile Phe Arg Ser Asp Lys Ser Gly Thr Ser Asp Asn Phe Gln
180 185 190 

Lys Tyr Leu Asp Gly Val Ser Asn Gly Ala Trp Gly Lys Gly Ala Ser 
195 200 205 

Glu Thr Phe Ser Gly Gly Val Gly Val Gly Ala Ser Gly Asn Asn Gly 
210 215 220 

Thr Ser Ala Leu Leu Gln Thr Thr Asp Gly Ser Ile Thr Tyr Asn Glu
225 230 235 240 

Trp Ser Phe Ala Val Gly Lys Gln Leu Asn Met Ala Gln Ile Ile Thr
245 250 255 

Ser Ala Gly Pro Asp Pro Val Ala Ile Thr Thr Glu Ser Val Gly Lys
260 265 270 

Thr Ile Ala Gly Ala Lys Ile Met Gly Gln Gly Asn Asp Leu Val Leu
275 280 285 

Asp Thr Ser Ser Phe Tyr Arg Pro Thr Gln Pro Gly Ser Tyr Pro Ile
290 295 300 

Val Leu Ala Thr Tyr Glu Ile Val Cys Ser Lys Tyr Pro Asp Ala Thr
305 310 315 320 

Thr Gly Thr Ala Val Arg Ala Phe Met Gln Ala Ala Ile Gly Pro Gly
325 330 335 

Gln Glu Gly Leu Asp Gln Tyr Gly Ser Ile Pro Leu Pro Lys Ser Phe
340 345 350 

Gln Ala Lys Leu Ala Ala Ala Val Asn Ala Ile Ser
355 360 
(2) INFORMATION FOR SEQ ID NO:75:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 309 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75: 

Gln Ala Ala Ala Gly Arg Ala Val Arg Arg Thr Gly His Ala Glu Asp
1 5 10 15 

Gln Thr His Gln Asp Arg Leu His His Gly Cys Arg Arg Ala Ala Val
20 25 30 

Val Val Arg Gln Asp Arg Ala Ser Val Ser Ala Thr Ser Ala Arg Pro
35 40 45

Pro Arg Arg His Pro Ala Gln Gly His Arg Arg Arg Val Ala Pro Ser
50 55 60 

Gly Gly Arg Arg Arg Pro His Pro His His Val Gln Pro Asp Asp Arg
65 70 75 80 

Arg Asp Arg Pro Ala Leu Leu Asp Arg Thr Gln Pro Ala Glu His Pro
85 90 95 

Asp Pro His Arg Arg Gly Pro Ala Asp Pro Gly Arg Val Arg Gly Arg 
100 105 110 



Gly Arg Leu Arg Arg Val Asp Asp Gly Arg Leu Gln Pro Asp Arg Asp
115 120 125 
Ala Asp His Gly Ala Pro Val Arg Gly Arg Gly Pro His Arg Gly Val
130 135 140 

Gln His Arg Gly Gly Pro Val Phe Val Arg Arg Val Pro Gly Val Arg
145 150 155 160 

Cys Ala His Arg Arg Gly His Arg Arg Val Ala Ala Pro Gly Gln Gly
165 170 175 

Asp Val Leu Arg Ala Gly Leu Arg Val Glu Arg Leu Arg Pro Val Ala 
180 185 190 

Ala Val Glu Asn Leu His Arg Gly Ser Gln Arg Ala Asp Gly Arg Val
195 200 205 

Phe Arg Pro Ile Arg Arg Gly Ala Arg Leu Pro Ala Arg Arg Ser Arg
210 215 220 

Ala Gly Pro Gln Gly Arg Leu His Leu Asp Gly Ala Gly Pro Ser Pro
225 230 235 240 

Leu Pro Ala Arg Ala Gly Gln Gln Gln Pro Ser Ser Ala Gly Gly Arg
245 250 255 

Arg Ala Gly Gly Ala Glu Arg Ala Asp Pro Gly Gln Arg Gly Arg His
260 265 270 

His Gln Gly Gly His Asp Pro Gly Arg Gln Gly Ala Gln Arg Gly Thr
275 280 285 

Ala Gly Val Ala His Ala Ala Ala Gly Pro Arg Arg Ala Ala Val Arg 
290 295 300 



Asn Arg Pro Arg Arg 
305 
(2) INFORMATION FOR SEQ ID NO: 76:

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 580 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:76:

Ser Ala Val Trp Cys Leu Asn Gly Phe Thr Gly Arg His Arg His Gly 
1 5 10 15 

Arg Cys Arg Val Arg Ala Ser Gly Trp Arg Ser Ser Asn Arg Trp Cys 
20 25 30 

Ser Thr Thr Ala Asp Cys Cys Ala Ser Lys Thr Pro Thr Gln Ala Ala
35 40 45 

Ser Pro Leu Glu Arg Arg Phe Thr Cys Cys Ser Pro Ala Val Gly Cys 
50 55 60 

Arg Phe Arg Ser Phe Pro Val Arg Arg Leu Ala Leu Gly Ala Arg Thr 
65 70 75 80 

Ser Arg Thr Leu Gly Val Arg Arg Thr Leu Ser Gln Trp Asn Leu Ser
85 90 95 

Pro Arg Ala Gln Pro Ser Cys Ala Val Thr Val Glu Ser His Thr His
100 105 110 



Ala Ser Pro Arg Met Ala Lys Leu Ala Arg Val Val Gly Leu Val Gln
115 120 125 
Glu Glu Gln Pro Ser Asp Met Thr Asn His Pro Arg Tyr Ser Pro Pro
130 135 140 

Pro Gln Gln Pro Gly Thr Pro Gly Tyr Ala Gln Gly Gln Gln Gln Thr
145 150 155 160 

Tyr Ser Gln Gln Phe Asp Trp Arg Tyr Pro Pro Ser Pro Pro Pro Gln
165 170 175

Pro Thr Gln Tyr Arg Gln Pro Tyr Glu Ala Leu Gly Gly Thr Arg Pro
180 185 190 

Gly Leu Ile Pro Gly Val Ile Pro Thr Met Thr Pro Pro Pro Gly Met
195 200 205 

Val Arg Gln Arg Pro Arg Ala Gly Met Leu Ala Ile Gly Ala Val Thr
210 215 220 



Ile Ala Val Val Ser Ala Gly Ile Gly Gly Ala Ala Ala Ser Leu Val
225 230 235 240 

Gly Phe Asn Arg Ala Pro Ala Gly Pro Ser Gly Gly Pro Val Ala Ala 
245 250 255 

Ser Ala Ala Pro Ser Ile Pro Ala Ala Asn Met Pro Pro Gly Ser Val
260 265 270 

Glu Gln Val Ala Ala Lys Val Val Pro Ser Val Val Met Leu Glu Thr
275 280 285 



Asp Leu Gly Arg Gln Ser Glu Glu Gly Ser Gly Ile Ile Leu Ser Ala
290 295 300 

Glu Gly Leu Ile Leu Thr Asn Asn His Val Ile Ala Ala Ala Ala Lys
305 310 315 320 
Pro Pro Leu Gly Ser Pro Pro Pro Lys Thr Thr Val Thr Phe Ser Asp
325 330 335 

Gly Arg Thr Ala Pro Phe Thr Val Val Gly Ala Asp Pro Thr Ser Asp 
340 345 350 

Ile Ala Val Val Arg Val Gln Gly Val Ser Gly Leu Thr Pro Ile Ser
355 360 365 

Leu Gly Ser Ser Ser Asp Leu Arg Val Gly Gln Pro Val Leu Ala Ile
370 375 380 

Gly Ser Pro Leu Gly Leu Glu Gly Thr Val Thr Thr Gly Ile Val Ser
385 390 395 400 

Ala Leu Asn Arg Pro Val Ser Thr Thr Gly Glu Ala Gly Asn Gln Asn
405 410 415 

Thr Val Leu Asp Ala Ile Gln Thr Asp Ala Ala Ile Asn Pro Gly Asn
420 425 430 

Ser Gly Gly Ala Leu Val Asn Met Asn Ala Gln Leu Val Gly Val Asn
435 440 445 

Ser Ala Ile Ala Thr Leu Gly Ala Asp Ser Ala Asp Ala Gln Ser Gly
450 455 460 

Ser Ile Gly Leu Gly Phe Ala Ile Pro Val Asp Gln Ala Lys Arg Ile
465 470 475 480 

Ala Asp Glu Leu Ile Ser Thr Gly Lys Ala Ser His Ala Ser Leu Gly
485 490 495 



Val Gln Val Thr Asn Asp Lys Asp Thr Pro Gly Ala Lys Ile Val Glu
500 505 510 
Val Val Ala Gly Gly Ala Ala Ala Asn Ala Gly Val Pro Lys Gly Val
515 520 525 

Val Val Thr Lys Val Asp Asp Arg Pro Ile Asn Ser Ala Asp Ala Leu
530 535 540 

Val Ala Ala Val Arg Ser Lys Ala Pro Gly Ala Thr Val Ala Leu Thr 
545 550 555 560 

Phe Gln Asp Pro Ser Gly Gly Ser Arg Thr Val Gln Val Thr Leu Gly
565 570 575 

Lys Ala Glu Gln
580 

(2) INFORMATION FOR SEQ ID NO: 77: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 233 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:77:

Met Asn Asp Gly Lys Arg Ala Val Thr Ser Ala Val Leu Val Val Leu 
1 5 10 15 

Gly Ala Cys Leu Ala Leu Trp Leu Ser Gly Cys Ser Ser Pro Lys Pro 
20 25 30 

Asp Ala Glu Glu Gln Gly Val Pro Val Ser Pro Thr Ala Ser Asp Pro
35 40 45 
Ala Leu Leu Ala Glu Ile Arg Gln Ser Leu Asp Ala Thr Lys Gly Leu
50 55 60 

Thr Ser Val His Val Ala Val Arg Thr Thr Gly Lys Val Asp Ser Leu
65 70 75 80 

Leu Gly Ile Thr Ser Ala Asp Val Asp Val Arg Ala Asn Pro Leu Ala
85 90 95 

Ala Lys Gly Val Cys Thr Tyr Asn Asp Glu Gln Gly Val Pro Phe Arg
100 105 110 

Val Gln Gly Asp Asn Ile Ser Val Lys Leu Phe Asp Asp Trp Ser Asn
115 120 125 

Leu Gly Ser Ile Ser Glu Leu Ser Thr Ser Arg Val Leu Asp Pro Ala
130 135 140 

Ala Gly Val Thr Gln Leu Leu Ser Gly Val Thr Asn Leu Gln Ala Gln
145 150 155 160 

Gly Thr Glu Val Ile Asp Gly Ile Ser Thr Thr Lys Ile Thr Gly Thr
165 170 175 

Ile Pro Ala Ser Ser Val Lys Met Leu Asp Pro Gly Ala Lys Ser Ala
180 185 190 

Arg Pro Ala Thr Val Trp Ile Ala Gln Asp Gly Ser His His Leu Val
195 200 205 

Arg Ala Ser Ile Asp Leu Gly Ser Gly Ser Ile Gln Leu Thr Gln Ser
210 215 220 



Lys Trp Asn Glu Pro Val Asn Val Asp 
225 230 
(2) INFORMATION FOR SEQ ID NO: 78:

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 66 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:78:

Val Ile Asp Ile Ile Gly Thr Ser Pro Thr Ser Trp Glu Gln Ala Ala
1 5 10 15

Ala Glu Ala Val Gln Arg Ala Arg Asp Ser Val Asp Asp Ile Arg Val
20 25 30 

Ala Arg Val Ile Glu Gln Asp Met Ala Val Asp Ser Ala Gly Lys Ile
35 40 45 

Thr Tyr Arg Ile Lys Leu Glu Val Ser Phe Lys Met Arg Pro Ala Gln
50 55 60 

Pro Arg 
65 

(2) INFORMATION FOR SEQ ID NO: 79: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 69 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79:

Val Pro Pro Ala Pro Pro Leu Pro Pro Leu Pro Pro Ser Pro Ile Ser
1 5 10 15

Cys Ala Ser Pro Pro Ser Pro Pro Leu Pro Pro Ala Pro Pro Val Ala 
20 25 30 

Pro Gly Pro Pro Met Pro Pro Leu Asp Pro Trp Pro Pro Ala Pro Pro 
35 40 45 

Leu Pro Tyr Ser Thr Pro Pro Gly Ala Pro Leu Pro Pro Ser Pro Pro 
50 55 60 

Ser Pro Pro Leu Pro 
65 

(2) INFORMATION FOR SEQ ID NO:80:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 355 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:80:

Met Ser Asn Ser Arg Arg Arg Ser Leu Arg Trp Ser Trp Leu Leu Ser
1 5 10 15

Val Leu Ala Ala Val Gly Leu Gly Leu Ala Thr Ala Pro Ala Gln Ala
20 25 30



Ala Pro Pro Ala Leu Ser Gln Asp Arg Phe Ala Asp Phe Pro Ala Leu
35 40 45 
Pro Leu Asp Pro Ser Ala Met Val Ala Gln Val Ala Pro Gln Val Val
50 55 60 

Asn Ile Asn Thr Lys Leu Gly Tyr Asn Asn Ala Val Gly Ala Gly Thr
65 70 75 80 

Gly Ile Val Ile Asp Pro Asn Gly Val Val Leu Thr Asn Asn His Val
85 90 95 

Ile Ala Gly Ala Thr Asp Ile Asn Ala Phe Ser Val Gly Ser Gly Gln
100 105 110 

Thr Tyr Gly Val Asp Val Val Gly Tyr Asp Arg Thr Gln Asp Val Ala
115 120 125 

Val Leu Gln Leu Arg Gly Ala Gly Gly Leu Pro Ser Ala Ala Ile Gly
130 135 140 

Gly Gly Val Ala Val Gly Glu Pro Val Val Ala Met Gly Asn Ser Gly 
145 150 155 160 

Gly Gln Gly Gly Thr Pro Arg Ala Val Pro Gly Arg Val Val Ala Leu
165 170 175 

Gly Gln Thr Val Gln Ala Ser Asp Ser Leu Thr Gly Ala Glu Glu Thr
180 185 190 

Leu Asn Gly Leu Ile Gln Phe Asp Ala Ala Ile Gln Pro Gly Asp Ser
195 200 205 



Gly Gly Pro Val Val Asn Gly Leu Gly Gln Val Val Gly Met Asn Thr
210 215 220 



Ala Ala Ser Asp Asn Phe Gln Leu Ser Gln Gly Gly Gln Gly Phe Ala
225 230 235 240 
Ile Pro Ile Gly Gln Ala Met Ala Ile Ala Gly Gln Ile Arg Ser Gly
245 250 255 

Gly Gly Ser Pro Thr Val His Ile Gly Pro Thr Ala Phe Leu Gly Leu
260 265 270 

Gly Val Val Asp Asn Asn Gly Asn Gly Ala Arg Val Gln Arg Val Val
275 280 285 

Gly Ser Ala Pro Ala Ala Ser Leu Gly Ile Ser Thr Gly Asp Val Ile
290 295 300 

Thr Ala Val Asp Gly Ala Pro Ile Asn Ser Ala Thr Ala Met Ala Asp
305 310 315 320 

Ala Leu Asn Gly His His Pro Gly Asp Val Ile Ser Val Asn Trp Gln
325 330 335 

Thr Lys Ser Gly Gly Thr Arg Thr Gly Asn Val Thr Leu Ala Glu Gly 
340 345 350 

Pro Pro Ala 
355 

(2) INFORMATION FOR SEQ ID NO:81:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 205 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:81:






Ser Pro Lys Pro Asp Ala Glu Glu Gln Gly Val Pro Val Ser Pro Thr
1 5 10 15

Ala Ser Asp Pro Ala Leu Leu Ala Glu Ile Arg Gln Ser Leu Asp Ala
20 25 30 

Thr Lys Gly Leu Thr Ser Val His Val Ala Val Arg Thr Thr Gly Lys 
35 40 45 

Val Asp Ser Leu Leu Gly Ile Thr Ser Ala Asp Val Asp Val Arg Ala
50 55 60 

Asn Pro Leu Ala Ala Lys Gly Val Cys Thr Tyr Asn Asp Glu Gln Gly
65 70 75 80 

Val Pro Phe Arg Val Gln Gly Asp Asn Ile Ser Val Lys Leu Phe Asp
85 90 95 

Asp Trp Ser Asn Leu Gly Ser Ile Ser Glu Leu Ser Thr Ser Arg Val
100 105 110 

Leu Asp Pro Ala Ala Gly Val Thr Gln Leu Leu Ser Gly Val Thr Asn
115 120 125 

Leu Gln Ala Gln Gly Thr Glu Val Ile Asp Gly Ile Ser Thr Thr Lys
130 135 140 

Ile Thr Gly Thr Ile Pro Ala Ser Ser Val Lys Met Leu Asp Pro Gly
145 150 155 160 

Ala Lys Ser Ala Arg Pro Ala Thr Val Trp Ile Ala Gln Asp Gly Ser
165 170 175 



His His Leu Val Arg Ala Ser Ile Asp Leu Gly Ser Gly Ser Ile Gln
180 185 190 
Leu Thr Gln Ser Lys Trp Asn Glu Pro Val Asn Val Asp
195 200 205 

(2) INFORMATION FOR SEQ ID NO: 82: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 286 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:82:

Gly Asp Ser Phe Trp Ala Ala Ala Asp Gln Met Ala Arg Gly Phe Val
1 5 10 15

Leu Gly Ala Thr Ala Gly Arg Thr Thr Leu Thr Gly Glu Gly Leu Gln
20 25 30 

His Ala Asp Gly His Ser Leu Leu Leu Asp Ala Thr Asn Pro Ala Val 
35 40 45 

Val Ala Tyr Asp Pro Ala Phe Ala Tyr Glu Ile Gly Tyr Ile Xaa Glu
50 55 60

Ser Gly Leu Ala Arg Met Cys Gly Glu Asn Pro Glu Asn Ile Phe Phe
65 70 75 80 

Tyr Ile Thr Val Tyr Asn Glu Pro Tyr Val Gln Pro Pro Glu Pro Glu
85 90 95 



Asn Phe Asp Pro Glu Gly Val Leu Gly Gly Ile Tyr Arg Tyr His Ala
100 105 110 
Ala Thr Glu Gln Arg Thr Asn Lys Xaa Gln Ile Leu Ala Ser Gly Val
115 120 125 

Ala Met Pro Ala Ala Leu Arg Ala Ala Gln Met Leu Ala Ala Glu Trp
130 135 140 

Asp Val Ala Ala Asp Val Trp Ser Val Thr Ser Trp Gly Glu Leu Asn
145 150 155 160 

Arg Asp Gly Val Val Ile Glu Thr Glu Lys Leu Arg His Pro Asp Arg
165 170 175 

Pro Ala Gly Val Pro Tyr Val Thr Arg Ala Leu Glu Asn Ala Arg Gly 
180 185 190 

Pro Val Ile Ala Val Ser Asp Trp Met Arg Ala Val Pro Glu Gln Ile
195 200 205 

Arg Pro Trp Val Pro Gly Thr Tyr Leu Thr Leu Gly Thr Asp Gly Phe 
210 215 220 

Gly Phe Ser Asp Thr Arg Pro Ala Gly Arg Arg Tyr Phe Asn Thr Asp 
225 230 235 240 

Ala Glu Ser Gln Val Gly Arg Gly Phe Gly Arg Gly Trp Pro Gly Arg
245 250 255 

Arg Val Asn Ile Asp Pro Phe Gly Ala Gly Arg Gly Pro Pro Ala Gln
260 265 270 

Leu Pro Gly Phe Asp Glu Gly Gly Gly Leu Arg Pro Xaa Lys 
275 280 285 

(2) INFORMATION FOR SEQ ID NO:83:



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 173 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:83:

Thr Lys Phe His Ala Leu Met Gln Glu Gln Ile His Asn Glu Phe Thr
1 5 10 15

Ala Ala Gln Gln Tyr Val Ala Ile Ala Val Tyr Phe Asp Ser Glu Asp
20 25 30 

Leu Pro Gln Leu Ala Lys His Phe Tyr Ser Gln Ala Val Glu Glu Arg
35 40 45 

Asn His Ala Met Met Leu Val Gln His Leu Leu Asp Arg Asp Leu Arg
50 55 60 

Val Glu Ile Pro Gly Val Asp Thr Val Arg Asn Gln Phe Asp Arg Pro
65 70 75 80 

Arg Glu Ala Leu Ala Leu Ala Leu Asp Gln Glu Arg Thr Val Thr Asp
85 90 95 

Gln Val Gly Arg Leu Thr Ala Val Ala Arg Asp Glu Gly Asp Phe Leu
100 105 110 

Gly Glu Gln Phe Met Gln Trp Phe Leu Gln Glu Gln Ile Glu Glu Val
115 120 125 



Ala Leu Met Ala Thr Leu Val Arg Val Ala Asp Arg Ala Gly Ala Asn 
130 135 140 
Leu Phe Glu Leu Glu Asn Phe Val Ala Arg Glu Val Asp Val Ala Pro
145 150 155 160 

Ala Ala Ser Gly Ala Pro His Ala Ala Gly Gly Arg Leu
165 170 

(2) INFORMATION FOR SEQ ID NO: 84: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 107 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:84:

Arg Ala Asp Glu Arg Lys Asn Thr Thr Met Lys Met Val Lys Ser Ile
1 5 10 15

Ala Ala Gly Leu Thr Ala Ala Ala Ala Ile Gly Ala Ala Ala Ala Gly
20 25 30 

Val Thr Ser Ile Met Ala Gly Gly Pro Val Val Tyr Gln Met Gln Pro
35 40 45 

Val Val Phe Gly Ala Pro Leu Pro Leu Asp Pro Xaa Ser Ala Pro Xaa 
50 55 60 

Val Pro Thr Ala Ala Gln Trp Thr Xaa Leu Leu Asn Xaa Leu Xaa Asp
65 70 75 80 



Pro Asn Val Ser Phe Xaa Asn Lys Gly Ser Leu Val Glu Gly Gly Ile
85 90 95 
Gly Gly Xaa Glu Gly Xaa Xaa Arg Arg Xaa Gln
100 105 

(2) INFORMATION FOR SEQ ID NO:85:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 125 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:85:

Val Leu Ser Val Pro Val Gly Asp Gly Phe Trp Xaa Arg Val Val Asn 
1 5 10 15 

Pro Leu Gly Gln Pro Ile Asp Gly Arg Gly Asp Val Asp Ser Asp Thr
20 25 30 

Arg Arg Ala Leu Glu Leu Gln Ala Pro Ser Val Val Xaa Arg Gln Gly
35 40 45 

Val Lys Glu Pro Leu Xaa Thr Gly Ile Lys Ala Ile Asp Ala Met Thr
50 55 60 

Pro Ile Gly Arg Gly Gln Arg Gln Leu Ile Ile Gly Asp Arg Lys Thr
65 70 75 80 

Gly Lys Asn Arg Arg Leu Cys Arg Thr Pro Ser Ser Asn Gln Arg Glu
85 90 95 

Glu Leu Gly Val Arg Trp Ile Pro Arg Ser Arg Cys Ala Cys Val Tyr
100 105 110 



Val Gly His Arg Ala Arg Arg Gly Thr Tyr His Arg Arg 
115 120 125 
(2) INFORMATION FOR SEQ ID NO: 86:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 117 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:86:

Cys Asp Ala Val Met Gly Phe Leu Gly Gly Ala Gly Pro Leu Ala Val 
1 5 10 15

Val Asp Gln Gln Leu Val Thr Arg Val Pro Gln Gly Trp Ser Phe Ala
20 25 30 

Gln Ala Ala Ala Val Pro Val Val Phe Leu Thr Ala Trp Tyr Gly Leu
35 40 45 

Ala Asp Leu Ala Glu Ile Lys Ala Gly Glu Ser Val Leu Ile His Ala
50 55 60 

Gly Thr Gly Gly Val Gly Met Ala Ala Val Gln Leu Ala Arg Gln Trp
65 70 75 80 

Gly Val Glu Val Phe Val Thr Ala Ser Arg Gly Lys Trp Asp Thr Leu 
85 90 95 

Arg Ala Xaa Xaa Phe Asp Asp Xaa Pro Tyr Arg Xaa Phe Pro His Xaa 
100 105 110 



Arg Ser Ser Xaa Gly 
115 
(2) INFORMATION FOR SEQ ID NO: 87:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 103 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:87: 

Met Tyr Arg Phe Ala Cys Arg Thr Leu Met Leu Ala Ala Cys Ile Leu
1 5 10 15 

Ala Thr Gly Val Ala Gly Leu Gly Val Gly Ala Gln Ser Ala Ala Gln
20 25 30 

Thr Ala Pro Val Pro Asp Tyr Tyr Trp Cys Pro Gly Gln Pro Phe Asp
35 40 45 

Pro Ala Trp Gly Pro Asn Trp Asp Pro Tyr Thr Cys His Asp Asp Phe 
50 55 60 

His Arg Asp Ser Asp Gly Pro Asp His Ser Arg Asp Tyr Pro Gly Pro 
65 70 75 80 

Ile Leu Glu Gly Pro Val Leu Asp Asp Pro Gly Ala Ala Pro Pro Pro
85 90 95 

Pro Ala Ala Gly Gly Gly Ala 
100 

(2) INFORMATION FOR SEQ ID NO: 88: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 88 amino acids 






(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:88:

Val Gln Cys Arg Val Trp Leu Glu Ile Gln Trp Arg Gly Met Leu Gly
1 5 10 15

Ala Asp Gln Ala Arg Ala Gly Gly Pro Ala Arg Ile Trp Arg Glu His
20 25 30 

Ser Met Ala Ala Met Lys Pro Arg Thr Gly Asp Gly Pro Leu Glu Ala 
35 40 45 

Thr Lys Glu Gly Arg Gly Ile Val Met Arg Val Pro Leu Glu Gly Gly
50 55 60 

Gly Arg Leu Val Val Glu Leu Thr Pro Asp Glu Ala Ala Ala Leu Gly 
65 70 75 80 

Asp Glu Leu Lys Gly Val Thr Ser 
85 

(2) INFORMATION FOR SEQ ID NO: 89: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 95 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:89:

Thr Asp Ala Ala Thr Leu Ala Gln Glu Ala Gly Asn Phe Glu Arg Ile
1 5 10 15 

Ser Gly Asp Leu Lys Thr Gln Ile Asp Gln Val Glu Ser Thr Ala Gly
20 25 30 

Ser Leu Gln Gly Gln Trp Arg Gly Ala Ala Gly Thr Ala Ala Gln Ala
35 40 45 

Ala Val Val Arg Phe Gln Glu Ala Ala Asn Lys Gln Lys Gln Glu Leu
50 55 60 

Asp Glu Ile Ser Thr Asn Ile Arg Gln Ala Gly Val Gln Tyr Ser Arg
65 70 75 80 

Ala Asp Glu Glu Gln Gln Gln Ala Leu Ser Ser Gln Met Gly Phe
85 90 95 

(2) INFORMATION FOR SEQ ID NO:90: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 166 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:90:

Met Thr Gln Ser Gln Thr Val Thr Val Asp Gln Gln Glu Ile Leu Asn
1 5 10 15

Arg Ala Asn Glu Val Glu Ala Pro Met Ala Asp Pro Pro Thr Asp Val 
20 25 30 
Pro Ile Thr Pro Cys Glu Leu Thr Xaa Xaa Lys Asn Ala Ala Gln Gln
35 40 45 

Xaa Val Leu Ser Ala Asp Asn Met Arg Glu Tyr Leu Ala Ala Gly Ala
50 55 60

Lys Glu Arg Gln Arg Leu Ala Thr Ser Leu Arg Asn Ala Ala Lys Xaa
65 70 75 80 

Tyr Gly Glu Val Asp Glu Glu Ala Ala Thr Ala Leu Asp Asn Asp Gly 
85 90 95 

Glu Gly Thr Val Gln Ala Glu Ser Ala Gly Ala Val Gly Gly Asp Ser
100 105 110 

Ser Ala Glu Leu Thr Asp Thr Pro Arg Val Ala Thr Ala Gly Glu Pro 
115 120 125 

Asn Phe Met Asp Leu Lys Glu Ala Ala Arg Lys Leu Glu Thr Gly Asp 
130 135 140 

Gln Gly Ala Ser Leu Ala His Xaa Gly Asp Gly Trp Asn Thr Xaa Thr
145 150 155 160 



Leu Thr Leu Gln Gly Asp
165 

(2) INFORMATION FOR SEQ ID NO: 91: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:91:

Arg Ala Glu Arg Met 
1 5

(2) INFORMATION FOR SEQ ID NO:92:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 263 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:92:

Val Ala Trp Met Ser Val Thr Ala Gly Gln Ala Glu Leu Thr Ala Ala
1 5 10 15

Gln Val Arg Val Ala Ala Ala Ala Tyr Glu Thr Ala Tyr Gly Leu Thr
20 25 30 

Val Pro Pro Pro Val Ile Ala Glu Asn Arg Ala Glu Leu Met Ile Leu
35 40 45 

Ile Ala Thr Asn Leu Leu Gly Gln Asn Thr Pro Ala Ile Ala Val Asn
50 55 60 

Glu Ala Glu Tyr Gly Glu Met Trp Ala Gln Asp Ala Ala Ala Met Phe
65 70 75 80 

Gly Tyr Ala Ala Ala Thr Ala Thr Ala Thr Ala Thr Leu Leu Pro Phe 
85 90 95 



Glu Glu Ala Pro Glu Met Thr Ser Ala Gly Gly Leu Leu Glu Gln Ala
100 105 110 
Ala Ala Val Glu Glu Ala Ser Asp Thr Ala Ala Ala Asn Gln Leu Met
115 120 125 

Asn Asn Val Pro Gln Ala Leu Lys Gln Leu Ala Gln Pro Thr Gln Gly
130 135 140 

Thr Thr Pro Ser Ser Lys Leu Gly Gly Leu Trp Lys Thr Val Ser Pro 
145 150 155 160 

His Arg Ser Pro Ile Ser Asn Met Val Ser Met Ala Asn Asn His Met
165 170 175 

Ser Met Thr Asn Ser Gly Val Ser Met Thr Asn Thr Leu Ser Ser Met 
180 185 190 

Leu Lys Gly Phe Ala Pro Ala Ala Ala Ala Gln Ala Val Gln Thr Ala
195 200 205 

Ala Gln Asn Gly Val Arg Ala Met Ser Ser Leu Gly Ser Ser Leu Gly
210 215 220 

Ser Ser Gly Leu Gly Gly Gly Val Ala Ala Asn Leu Gly Arg Ala Ala 
225 230 235 240 

Ser Val Arg Tyr Gly His Arg Asp Gly Gly Lys Tyr Ala Xaa Ser Gly 
245 250 255 

Arg Arg Asn Gly Gly Pro Ala
260



(2) INFORMATION FOR SEQ ID NO: 93: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 303 amino acids 

(B) TYPE: amino acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:93:

Met Thr Tyr Ser Pro Gly Asn Pro Gly Tyr Pro Gln Ala Gln Pro Ala
1 5 10 15 

Gly Ser Tyr Gly Gly Val Thr Pro Ser Phe Ala His Ala Asp Glu Gly 
20 25 30 

Ala Ser Lys Leu Pro Met Tyr Leu Asn Ile Ala Val Ala Val Leu Gly
35 40 45 

Leu Ala Ala Tyr Phe Ala Ser Phe Gly Pro Met Phe Thr Leu Ser Thr 
50 55 60 

Glu Leu Gly Gly Gly Asp Gly Ala Val Ser Gly Asp Thr Gly Leu Pro 
65 70 75 80 

Val Gly Val Ala Leu Leu Ala Ala Leu Leu Ala Gly Val Val Leu Val 
85 90 95 

Pro Lys Ala Lys Ser His Val Thr Val Val Ala Val Leu Gly Val Leu 
100 105 110 

Gly Val Phe Leu Met Val Ser Ala Thr Phe Asn Lys Pro Ser Ala Tyr 
115 120 125 

Ser Thr Gly Trp Ala Leu Trp Val Val Leu Ala Phe Ile Val Phe Gln
130 135 140 



Ala Val Ala Ala Val Leu Ala Leu Leu Val Glu Thr Gly Ala Ile Thr
145 150 155 160 
Ala Pro Ala Pro Arg Pro Lys Phe Asp Pro Tyr Gly Gln Tyr Gly Arg
165 170 175 

Tyr Gly Gln Tyr Gly Gln Tyr Gly Val Gln Pro Gly Gly Tyr Tyr Gly
180 185 190 

Gln Gln Gly Ala Gln Gln Ala Ala Gly Leu Gln Ser Pro Gly Pro Gln
195 200 205 

Gln Ser Pro Gln Pro Pro Gly Tyr Gly Ser Gln Tyr Gly Gly Tyr Ser
210 215 220 

Ser Ser Pro Ser Gln Ser Gly Ser Gly Tyr Thr Ala Gln Pro Pro Ala
225 230 235 240 

Gln Pro Pro Ala Gln Ser Gly Ser Gln Gln Ser His Gln Gly Pro Ser
245 250 255 

Thr Pro Pro Thr Gly Phe Pro Ser Phe Ser Pro Pro Pro Pro Val Ser 
260 265 270 

Ala Gly Thr Gly Ser Gln Ala Gly Ser Ala Pro Val Asn Tyr Ser Asn
275 280 285 

Pro Ser Gly Gly Glu Gln Ser Ser Ser Pro Gly Gly Ala Pro Val
290 295 300 

(2) INFORMATION FOR SEQ ID NO: 94: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 168 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94:

Met Lys Met Val Lys Ser Ile Ala Ala Gly Leu Thr Ala Ala Ala Ala
1 5 10 15

Ile Gly Ala Ala Ala Ala Gly Val Thr Ser Ile Met Ala Gly Gly Pro
20 25 30 

Val Val Tyr Gln Met Gln Pro Val Val Phe Gly Ala Pro Leu Pro Leu
35 40 45 

Asp Pro Ala Ser Ala Pro Asp Val Pro Thr Ala Ala Gln Leu Thr Ser
50 55 60 

Leu Leu Asn Ser Leu Ala Asp Pro Asn Val Ser Phe Ala Asn Lys Gly 
65 70 75 80 

Ser Leu Val Glu Gly Gly Ile Gly Gly Thr Glu Ala Arg Ile Ala Asp
85 90 95 

His Lys Leu Lys Lys Ala Ala Glu His Gly Asp Leu Pro Leu Ser Phe 
100 105 110 

Ser Val Thr Asn Ile Gln Pro Ala Ala Ala Gly Ser Ala Thr Ala Asp
115 120 125 

Val Ser Val Ser Gly Pro Lys Leu Ser Ser Pro Val Thr Gln Asn Val
130 135 140 

Thr Phe Val Asn Gln Gly Gly Trp Met Leu Ser Arg Ala Ser Ala Met
145 150 155 160 

Glu Leu Leu Gln Ala Ala Gly Asn
165 

(2) INFORMATION FOR SEQ ID NO: 95: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 332 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:95:

Met His His His His His His Met His Gln Val Asp Pro Asn Leu Thr
1 5 10 15

Arg Arg Lys Gly Arg Leu Ala Ala Leu Ala Ile Ala Ala Met Ala Ser
20 25 30 

Ala Ser Leu Val Thr Val Ala Val Pro Ala Thr Ala Asn Ala Asp Pro 
35 40 45 

Glu Pro Ala Pro Pro Val Pro Thr Thr Ala Ala Ser Pro Pro Ser Thr
50 55 60 

Ala Ala Ala Pro Pro Ala Pro Ala Thr Pro Val Ala Pro Pro Pro Pro 
65 70 75 80 

Ala Ala Ala Asn Thr Pro Asn Ala Gln Pro Gly Asp Pro Asn Ala Ala
85 90 95 

Pro Pro Pro Ala Asp Pro Asn Ala Pro Pro Pro Pro Val Ile Ala Pro
100 105 110 

Asn Ala Pro Gln Pro Val Arg Ile Asp Asn Pro Val Gly Gly Phe Ser
115 120 125 



Phe Ala Leu Pro Ala Gly Trp Val Glu Ser Asp Ala Ala His Phe Asp 
130 135 140 
Tyr Gly Ser Ala Leu Leu Ser Lys Thr Thr Gly Asp Pro Pro Phe Pro
145 150 155 160 

Gly Gln Pro Pro Pro Val Ala Asn Asp Thr Arg Ile Val Leu Gly Arg
165 170 175 

Leu Asp Gln Lys Leu Tyr Ala Ser Ala Glu Ala Thr Asp Ser Lys Ala
180 185 190 

Ala Ala Arg Leu Gly Ser Asp Met Gly Glu Phe Tyr Met Pro Tyr Pro 
195 200 205 

Gly Thr Arg Ile Asn Gln Glu Thr Val Ser Leu Asp Ala Asn Gly Val
210 215 220 

Ser Gly Ser Ala Ser Tyr Tyr Glu Val Lys Phe Ser Asp Pro Ser Lys 
225 230 235 240 

Pro Asn Gly Gln Ile Trp Thr Gly Val Ile Gly Ser Pro Ala Ala Asn
245 250 255 

Ala Pro Asp Ala Gly Pro Pro Gln Arg Trp Phe Val Val Trp Leu Gly
260 265 270 

Thr Ala Asn Asn Pro Val Asp Lys Gly Ala Ala Lys Ala Leu Ala Glu 
275 280 285 

Ser Ile Arg Pro Leu Val Ala Pro Pro Pro Ala Pro Ala Pro Ala Pro
290 295 300 

Ala Glu Pro Ala Pro Ala Pro Ala Pro Ala Gly Glu Val Ala Pro Thr 
305 310 315 320 



Pro Thr Thr Pro Thr Pro Gln Arg Thr Leu Pro Ala
325 330 
(2) INFORMATION FOR SEQ ID NO:96:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 500 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96:
CGTGGCAATG TCGTTGACCG TCGGGGCCGG GGTCGCCTCC GCAGATCCCG TGGACGCGGT 60
CATTAACACC ACCTGCAATT ACGGGCAGGT AGTAGCTGCG CTCAACGCGA CGGATCCGGG 120
GGCTGCCGCA CAGTTCAACG CCTCACCGGT GGCGCAGTCC TATTTGCGCA ATTTCCTCGC 180
CGCACCGCCA CCTCAGCGCG CTGCCATGGC CGCGCAATTG CAAGCTGTGC CGGGGGCGGC 240
ACAGTACATC GGCCTTGTCG AGTCGGTTGC CGGCTCCTGC AACAACTATT AAGCCCATGC 300
GGGCCCCATC CCGCGACCCG GCATCGTCGC CGGGGCTAGG CCAGATTGCC CCGCTCCTCA 360
ACGGGCCGCA TCCCGCGACC CGGCATCGTC GCCGGGGCTA GGCCAGATTG CCCCGCTCCT 420
CAACGGGCCG CATCTCGTGC CGAATTCCTG CAGCCCGGGG GATCCACTAG TTCTAGAGCG 480
GCCGCCACCG CGGTGGAGCT 500
(2) INFORMATION FOR SEQ ID NO: 97: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 



(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97:

Val Ala Met Ser Leu Thr Val Gly Ala Gly Val Ala Ser Ala Asp Pro
1 5 10 15

Val Asp Ala Val Ile Asn Thr Thr Cys Asn Tyr Gly Gln Val Val Ala
20 25 30

Ala Leu Asn Ala Thr Asp Pro Gly Ala Ala Ala Gln Phe Asn Ala Ser
35 40 45

Pro Val Ala Gln Ser Tyr Leu Arg Asn Phe Leu Ala Ala Pro Pro Pro
50 55 60

Gln Arg Ala Ala Met Ala Ala Gln Leu Gln Ala Val Pro Gly Ala Ala
65 70 75 80

Gln Tyr Ile Gly Leu Val Glu Ser Val Ala Gly Ser Cys Asn Asn Tyr
85 90 95 

(2) INFORMATION FOR SEQ ID NO: 98: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 154 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98: 
ATGACAGAGC AGCAGTGGAA TTTCGCGGGT ATCGAGGCCG CGGCAAGCGC AATCCAGGGA 60
AATGTCACGT CCATTCATTC CCTCCTTGAC GAGGGGAAGC AGTCCCTGAC CAAGCTCGCA 120
GCGGCCTGGG GCGGTAGCGG TTCGGAAGCG TACC 154
(2) INFORMATION FOR SEQ ID NO: 99: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 51 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99:

Met Thr Glu Gln Gln Trp Asn Phe Ala Gly Ile Glu Ala Ala Ala Ser
1 5 10 15

Ala Ile Gln Gly Asn Val Thr Ser Ile His Ser Leu Leu Asp Glu Gly
20 25 30

Lys Gln Ser Leu Thr Lys Leu Ala Ala Ala Trp Gly Gly Ser Gly Ser
35 40 45 

Glu Ala Tyr 
50 
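
The relationship between a listed nucleic acid sequence and its listed translation can be checked mechanically. The following is a minimal Python sketch, assuming the standard genetic code and reading frame 1; the names `translate`, `to_three_letter` and `FRAGMENT_98` are illustrative and not part of the application. It translates the first 48 nucleotides of SEQ ID NO: 98 and prints residues 1-16 in the three-letter notation used in this listing.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# One-letter to three-letter codes; "Xaa" marks an unresolved codon.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter", "X": "Xaa",
}


def translate(dna: str) -> str:
    """Translate DNA in frame 1; whitespace is ignored, trailing bases dropped."""
    seq = "".join(dna.split()).upper()
    usable = len(seq) - len(seq) % 3
    # Codons containing an ambiguous base (e.g. N) are reported as X.
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3))


def to_three_letter(protein: str) -> str:
    """Render a one-letter protein string in three-letter notation."""
    return " ".join(THREE_LETTER[residue] for residue in protein)


# First 48 nucleotides of SEQ ID NO: 98 as printed in the listing.
FRAGMENT_98 = "ATGACAGAGC AGCAGTGGAA TTTCGCGGGT ATCGAGGCCG CGGCAAGC"

if __name__ == "__main__":
    print(to_three_letter(translate(FRAGMENT_98)))
```

The same check can be run on any other nucleotide/protein pair in the listing, provided the correct reading frame offset is removed from the start of the nucleotide sequence first.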

(2) INFORMATION FOR SEQ ID NO: 100: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 282 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100:
CGGTCGCGCA CTTCCAGGTG ACTATGAAAG TCGGCTTCCG NCTGGAGGAT TCCTGAACCT 60

TCAAGCGCGG CCGATAACTG AGGTGCATCA TTAAGCGACT TTTCCAGAAC ATCCTGACGC 120

GCTCGAAACG CGGCACAGCC GACGGTGGCT CCGNCGAGGC GCTGNCTCCA AAATCCCTGA 180

GACAATTCGN CGGGGGCGCC TACAAGGAAG TCGGTGCTGA ATTCGNCGNG TATCTGGTCG 240

ACCTGTGTGG TCTGNAGCCG GACGAAGCGG TGCTCGACGT CG 282
(2) INFORMATION FOR SEQ ID NO: 101: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1565 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101: 

GTATGCGGCC ACTGAAGTCG CCAATGCGGC GGCGGCCAGC TAAGCCAGGA ACAGTCGGCA 60

CGAGAAACCA CGAGAAATAG GGACACGTAA TGGTGGATTT CGGGGCGTTA CCACCGGAGA 120

TCAACTCCGC GAGGATGTAC GCCGGCCCGG GTTCGGCCTC GCTGGTGGCC GCGGCTCAGA 180

TGTGGGACAG CGTGGCGAGT GACCTGTTTT CGGCCGCGTC GGCGTTTCAG TCGGTGGTCT 240

GGGGTCTGAC GGTGGGGTCG TGGATAGGTT CGTCGGCGGG TCTGATGGTG GCGGCGGCCT 300

CGCCGTATGT GGCGTGGATG AGCGTCACCG CGGGGCAGGC CGAGCTGACC GCCGCCCAGG 360

TCCGGGTTGC TGCGGCGGCC TACGAGACGG CGTATGGGCT GACGGTGCCC CCGCCGGTGA 420

TCGCCGAGAA CCGTGCTGAA CTGATGATTC TGATAGCGAC CAACCTCTTG GGGCAAAACA 480

CCCCGGCGAT CGCGGTCAAC GAGGCCGAAT ACGGCGAGAT GTGGGCCCAA GACGCCGCCG 540

CGATGTTTGG CTACGCCGCG GCGACGGCGA CGGCGACGGC GACGTTGCTG CCGTTCGAGG 600

AGGCGCCGGA GATGACCAGC GCGGGTGGGC TCCTCGAGCA GGCCGCCGCG GTCGAGGAGG 660

CCTCCGACAC CGCCGCGGCG AACCAGTTGA TGAACAATGT GCCCCAGGCG CTGCAACAGC 720

TGGCCCAGCC CACGCAGGGC ACCACGCCTT CTTCCAAGCT GGGTGGCCTG TGGAAGACGG 780

TCTCGCCGCA TCGGTCGCCG ATCAGCAACA TGGTGTCAAT GGCCAACAAC CACATGTCAA 840

TGACCAACTC GGGTGTGTCA ATGACCAACA CCTTGAGCTC GATGTTGAAG GGCTTTGCTC 900

CGGCGGCGGC CGCCCAGGCC GTGCAAACCG CGGCGCAAAA CGGGGTCCGG GCGATGAGCT 960

CGCTGGGCAG CTCGCTGGGT TCTTCGGGTC TGGGCGGTGG GGTGGCCGCC AACTTGGGTC 1020

GGGCGGCCTC GGTCGGTTCG TTGTCGGTGC CGCAGGCCTG GGCCGCGGCC AACCAGGCAG 1080

TCACCCCGGC GGCGCGGGCG CTGCCGCTGA CCAGCCTGAC CAGCGCCGCG GAAAGAGGGC 1140

CCGGGCAGAT GCTGGGCGGG CTGCCGGTGG GGCAGATGGG CGCCAGGGCC GGTGGTGGGC 1200

TCAGTGGTGT GCTGCGTGTT CCGCCGCGAC CCTATGTGAT GCCGCATTCT CCGGCGGCCG 1260

GCTAGGAGAG GGGGCGCAGA CTGTCGTTAT TTGACCAGTG ATCGGCGGTC TCGGTGTTTC 1320

CGCGGCCGGC TATGACAACA GTCAATGTGC ATGACAAGTT ACAGGTATTA GGTCCAGGTT 1380

CAACAAGGAG ACAGGCAACA TGGCCTCACG TTTTATGACG GATCCGCACG CGATGCGGGA 1440

CATGGCGGGC CGTTTTGAAG TGCACGCCCA GACGGTGGAG GACGAGGCTC GCCGGATGTG 1500

GGCGTCCGCG CAAAACATTT CCGGTGCGGG CTGGAGTGGC ATGGCCGAGG CGACCTCGCT 1560

AGACA 1565

(2) INFORMATION FOR SEQ ID NO: 102:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 391 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102:

Met Val Asp Phe Gly Ala Leu Pro Pro Glu Ile Asn Ser Ala Arg Met
1 5 10 15

Tyr Ala Gly Pro Gly Ser Ala Ser Leu Val Ala Ala Ala Gln Met Trp
20 25 30

Asp Ser Val Ala Ser Asp Leu Phe Ser Ala Ala Ser Ala Phe Gln Ser
35 40 45

Val Val Trp Gly Leu Thr Val Gly Ser Trp Ile Gly Ser Ser Ala Gly
50 55 60 

Leu Met Val Ala Ala Ala Ser Pro Tyr Val Ala Trp Met Ser Val Thr 
65 70 75 80 

Ala Gly Gln Ala Glu Leu Thr Ala Ala Gln Val Arg Val Ala Ala Ala
85 90 95

Ala Tyr Glu Thr Ala Tyr Gly Leu Thr Val Pro Pro Pro Val Ile Ala
100 105 110

Glu Asn Arg Ala Glu Leu Met Ile Leu Ile Ala Thr Asn Leu Leu Gly
115 120 125

Gln Asn Thr Pro Ala Ile Ala Val Asn Glu Ala Glu Tyr Gly Glu Met
130 135 140

Trp Ala Gln Asp Ala Ala Ala Met Phe Gly Tyr Ala Ala Ala Thr Ala
145 150 155 160 

Thr Ala Thr Ala Thr Leu Leu Pro Phe Glu Glu Ala Pro Glu Met Thr 
165 170 175 

Ser Ala Gly Gly Leu Leu Glu Gln Ala Ala Ala Val Glu Glu Ala Ser
180 185 190

Asp Thr Ala Ala Ala Asn Gln Leu Met Asn Asn Val Pro Gln Ala Leu
195 200 205

Gln Gln Leu Ala Gln Pro Thr Gln Gly Thr Thr Pro Ser Ser Lys Leu
210 215 220

Gly Gly Leu Trp Lys Thr Val Ser Pro His Arg Ser Pro Ile Ser Asn
225 230 235 240 

Met Val Ser Met Ala Asn Asn His Met Ser Met Thr Asn Ser Gly Val 
245 250 255 

Ser Met Thr Asn Thr Leu Ser Ser Met Leu Lys Gly Phe Ala Pro Ala 
260 265 270 

Ala Ala Ala Gln Ala Val Gln Thr Ala Ala Gln Asn Gly Val Arg Ala
275 280 285 



Met Ser Ser Leu Gly Ser Ser Leu Gly Ser Ser Gly Leu Gly Gly Gly 
290 295 300 

Val Ala Ala Asn Leu Gly Arg Ala Ala Ser Val Gly Ser Leu Ser Val
305 310 315 320

Pro Gln Ala Trp Ala Ala Ala Asn Gln Ala Val Thr Pro Ala Ala Arg
325 330 335 

Ala Leu Pro Leu Thr Ser Leu Thr Ser Ala Ala Glu Arg Gly Pro Gly 
340 345 350 

Gln Met Leu Gly Gly Leu Pro Val Gly Gln Met Gly Ala Arg Ala Gly
355 360 365 

Gly Gly Leu Ser Gly Val Leu Arg Val Pro Pro Arg Pro Tyr Val Met 
370 375 380 

Pro His Ser Pro Ala Ala Gly 
385 390

(2) INFORMATION FOR SEQ ID NO: 103: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 259 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103:
ACCAACACCT TGCACTCNAT GTTGAAGGGC TTAGCTCCGG CGGCGGCTCA GGCCGTGGAA 60
ACCGCGGCGG AAAACGGGGT CTGGGCAATG AGCTCGCTGG GCAGCCAGCT GGGTTCGTCG 120
CTGGGTTCTT CGGGTCTGGG CGCTGGGGTG GCCGCCAACT TGGGTCGGGC GGCCTCGGTC 180
GGTTCGTTGT CGGTGCCGCC AGCATGGGCC GCGGCCAACC AGGCGGTCAC CCCGGCGGCG 240
CGGGCGCTGC CGCTGACCA 259
(2) INFORMATION FOR SEQ ID NO: 104: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 86 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104: 

Thr Asp Thr Leu His Ser Met Leu Lys Gly Leu Ala Pro Ala Ala Ala 
1 5 10 15

Gln Ala Val Glu Thr Ala Ala Glu Asn Gly Val Trp Ala Met Ser Ser
20 25 30

Leu Gly Ser Gln Leu Gly Ser Ser Leu Gly Ser Ser Gly Leu Gly Ala
35 40 45

Gly Val Ala Ala Asn Leu Gly Arg Ala Ala Ser Val Gly Ser Leu Ser
50 55 60

Val Pro Pro Ala Trp Ala Ala Ala Asn Gln Ala Val Thr Pro Ala Ala
65 70 75 80

Arg Ala Leu Pro Leu Thr 
85 



(2) INFORMATION FOR SEQ ID NO: 105: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1109 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105: 

TACTTGAGAG AATTTGACCT GTTGCCGACG TTGTTTGCTG TCCATCATTG GTGCTAGTTA 60

TGGCCGAGCG GAAGGATTAT CGAAGTGGTG GACTTCGGGG CGTTACCACC GGAGATCAAC 120

TCCGCGAGGA TGTACGCCGG CCCGGGTTCG GCCTCGCTGG TGGCCGCCGC GAAGATGTGG 180

GACAGCGTGG CGAGTGACCT GTTTTCGGCC GCGTCGGCGT TTCAGTCGGT GGTCTGGGGT 240

CTGACGACGG GATCGTGGAT AGGTTCGTCG GCGGGTCTGA TGGTGGCGGC GGCCTCGCCG 300

TATGTGGCGT GGATGAGCGT CACCGCGGGG CAGGCCGAGC TGACCGCCGC CCAGGTCCGG 360

GTTGCTGCGG CGGCCTACGA GACGGCGTAT GGGCTGACGG TGCCCCCGCC GGTGATCGCC 420

GAGAACCGTG CTGAACTGAT GATTCTGATA GCGACCAACC TCTTGGGGCA AAACACCCCG 480

GCGATCGCGG TCAACGAGGC CGAATACGGG GAGATGTGGG CCCAAGACGC CGCCGCGATG 540

TTTGGCTACG CCGCCACGGC GGCGACGGCG ACCGAGGCGT TGCTGCCGTT CGAGGACGCC 600

CCACTGATCA CCAACCCCGG CGGGCTCCTT GAGCAGGCCG TCGCGGTCGA GGAGGCCATC 660

GACACCGCCG CGGCGAACCA GTTGATGAAC AATGTGCCCC AAGCGCTGCA ACAACTGGCC 720

CAGCCCACGA AAAGCATCTG GCCGTTCGAC CAACTGAGTG AACTCTGGAA AGCCATCTCG 780

CCGCATCTGT CGCCGCTCAG CAACATCGTG TCGATGCTCA ACAACCACGT GTCGATGACC 840

AACTCGGGTG TGTCAATGGC CAGCACCTTG CACTCAATGT TGAAGGGCTT TGCTCCGGCG 900

GCGGCTCAGG CCGTGGAAAC CGCGGCGCAA AACGGGGTCC AGGCGATGAG CTCGCTGGGC 960

AGCCAGCTGG GTTCGTCGCT GGGTTCTTCG GGTCTGGGCG CTGGGGTGGC CGCCAACTTG 1020

GGTCGGGCGG CCTCGGTCGG TTCGTTGTCG GTGCCGCAGG CCTGGGCCGC GGCCAACCAG 1080

GCGGTCACCC CGGCGGCGCG GGCGCTGCC 1109
(2) INFORMATION FOR SEQ ID NO: 106: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 341 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106: 

Val Val Asp Phe Gly Ala Leu Pro Pro Glu Ile Asn Ser Ala Arg Met
1 5 10 15

Tyr Ala Gly Pro Gly Ser Ala Ser Leu Val Ala Ala Ala Lys Met Trp 
20 25 30 

Asp Ser Val Ala Ser Asp Leu Phe Ser Ala Ala Ser Ala Phe Gln Ser
35 40 45

Val Val Trp Gly Leu Thr Thr Gly Ser Trp Ile Gly Ser Ser Ala Gly
50 55 60 

Leu Met Val Ala Ala Ala Ser Pro Tyr Val Ala Trp Met Ser Val Thr 
65 70 75 80 

Ala Gly Gln Ala Glu Leu Thr Ala Ala Gln Val Arg Val Ala Ala Ala
85 90 95

Ala Tyr Glu Thr Ala Tyr Gly Leu Thr Val Pro Pro Pro Val Ile Ala
100 105 110

Glu Asn Arg Ala Glu Leu Met Ile Leu Ile Ala Thr Asn Leu Leu Gly
115 120 125

Gln Asn Thr Pro Ala Ile Ala Val Asn Glu Ala Glu Tyr Gly Glu Met
130 135 140

Trp Ala Gln Asp Ala Ala Ala Met Phe Gly Tyr Ala Ala Thr Ala Ala
145 150 155 160

Thr Ala Thr Glu Ala Leu Leu Pro Phe Glu Asp Ala Pro Leu Ile Thr
165 170 175

Asn Pro Gly Gly Leu Leu Glu Gln Ala Val Ala Val Glu Glu Ala Ile
180 185 190

Asp Thr Ala Ala Ala Asn Gln Leu Met Asn Asn Val Pro Gln Ala Leu
195 200 205

Gln Gln Leu Ala Gln Pro Thr Lys Ser Ile Trp Pro Phe Asp Gln Leu
210 215 220

Ser Glu Leu Trp Lys Ala Ile Ser Pro His Leu Ser Pro Leu Ser Asn
225 230 235 240

Ile Val Ser Met Leu Asn Asn His Val Ser Met Thr Asn Ser Gly Val
245 250 255 



Ser Met Ala Ser Thr Leu His Ser Met Leu Lys Gly Phe Ala Pro Ala 
260 265 270 

Ala Ala Gln Ala Val Glu Thr Ala Ala Gln Asn Gly Val Gln Ala Met
275 280 285

Ser Ser Leu Gly Ser Gln Leu Gly Ser Ser Leu Gly Ser Ser Gly Leu
290 295 300 

Gly Ala Gly Val Ala Ala Asn Leu Gly Arg Ala Ala Ser Val Gly Ser 
305 310 315 320 

Leu Ser Val Pro Gln Ala Trp Ala Ala Ala Asn Gln Ala Val Thr Pro
325 330 335 

Ala Ala Arg Ala Leu 
340 

(2) INFORMATION FOR SEQ ID NO: 107: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1256 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107:

CATCGGAGGG AGTGATCACC ATGCTGTGGC ACGCAATGCC ACCGGAGNTA AATACCGCAC 60

GGCTGATGGC CGGCGCGGGT CCGGCTCCAA TGCTTGCGGC GGCCGCGGGA TGGCAGACGC 120

TTTCGGCGGC TCTGGACGCT CAGGCCGTCG AGTTGACCGC GCGCCTGAAC TCTCTGGGAG 180

AAGCCTGGAC TGGAGGTGGC AGCGACAAGG CGCTTGCGGC TGCAACGCCG ATGGTGGTCT 240

GGCTACAAAC CGCGTCAACA CAGGCCAAGA CCCGTGCGAT GCAGGCGACG GCGCAAGCCG 300

CGGCATACAC CCAGGCCATG GCCACGACGC CGTCGCTGCC GGAGATCGCC GCCAACCACA 360

TCACCCAGGC CGTCCTTACG GCCACCAACT TCTTCGGTAT CAACACGATC CCGATCGCGT 420

TGACCGAGAT GGATTATTTC ATCCGTATGT GGAACCAGGC AGCCCTGGCA ATGGAGGTCT 480

ACCAGGCCGA GACCGCGGTT AACACGCTTT TCGAGAAGCT CGAGCCGATG GCGTCGATCC 540

TTGATCCCGG CGCGAGCCAG AGCACGACGA ACCCGATCTT CGGAATGCCC TCCCCTGGCA 600

GCTCAACACC GGTTGGCCAG TTGCCGCCGG CGGCTACCCA GACCCTCGGC CAACTGGGTG 660

AGATGAGCGG CCCGATGCAG CAGCTGACCC AGCCGCTGCA GCAGGTGACG TCGTTGTTCA 720

GCCAGGTGGG CGGCACCGGC GGCGGCAACC CAGCCGACGA GGAAGCCGCG CAGATGGGCC 780

TGCTCGGCAC CAGTCCGCTG TCGAACCATC CGCTGGCTGG TGGATCAGGC CCCAGCGCGG 840

GCGCGGGCCT GCTGCGCGCG GAGTCGCTAC CTGGCGCAGG TGGGTCGTTG ACCCGCACGC 900

CGCTGATGTC TCAGCTGATC GAAAAGCCGG TTGCCCCCTC GGTGATGCCG GCGGCTGCTG 960

CCGGATCGTC GGCGACGGGT GGCGCCGCTC CGGTGGGTGC GGGAGCGATG GGCCAGGGTG 1020

CGCAATCCGG CGGCTCCACC AGGCCGGGTC TGGTCGCGCC GGCACCGCTC GCGCAGGAGC 1080

GTGAAGAAGA CGACGAGGAC GACTGGGACG AAGAGGACGA CTGGTGAGCT CCCGTAATGA 1140

CAACAGACTT CCCGGCCACC CGGGCCGGAA GACTTGCCAA CATTTTGGCG AGGAAGGTAA 1200

AGAGAGAAAG TAGTCCAGCA TGGCAGAGAT GAAGACCGAT GCCGCTACCC TCGCGC 1256

(2) INFORMATION FOR SEQ ID NO: 108: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 432 base pairs 






(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108:
CTAGTGGATG GGACCATGGC CATTTTCTGC AGTCTCACTG CCTTCTGTGT TGACATTTTG 60
GCACGCCGGC GGAAACGAAG CACTGGGGTC GAAGAACGGC TGCGCTGCCA TATCGTCCGG 120
AGCTTCCATA CCTTCGTGCG GCCGGAAGAG CTTGTCGTAG TCGGCCGCCA TGACAACCTC 180
TCAGAGTGCG CTCAAACGTA TAAACACGAG AAAGGGCGAG ACCGACGGAA GGTCGAACTC 240
GCCCGATCCC GTGTTTCGCT ATTCTACGCG AACTCGGCGT TGCCCTATGC GAACATCCCA 300
GTGACGTTGC CTTCGGTCGA AGCCATTGCC TGACCGGCTT CGCTGATCGT CCGCGCCAGG 360
TTCTGCAGCG CGTTGTTCAG CTCGGTAGCC GTGGCGTCCC ATTTTTGCTG GACACCCTGG 420
TACGCCTCCG AA 432

(2) INFORMATION FOR SEQ ID NO: 109: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 368 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109:

Met Leu Trp His Ala Met Pro Pro Glu Xaa Asn Thr Ala Arg Leu Met
1 5 10 15

Ala Gly Ala Gly Pro Ala Pro Met Leu Ala Ala Ala Ala Gly Trp Gln
20 25 30

Thr Leu Ser Ala Ala Leu Asp Ala Gln Ala Val Glu Leu Thr Ala Arg
35 40 45 

Leu Asn Ser Leu Gly Glu Ala Trp Thr Gly Gly Gly Ser Asp Lys Ala 
50 55 60 

Leu Ala Ala Ala Thr Pro Met Val Val Trp Leu Gln Thr Ala Ser Thr
65 70 75 80

Gln Ala Lys Thr Arg Ala Met Gln Ala Thr Ala Gln Ala Ala Ala Tyr
85 90 95

Thr Gln Ala Met Ala Thr Thr Pro Ser Leu Pro Glu Ile Ala Ala Asn
100 105 110

His Ile Thr Gln Ala Val Leu Thr Ala Thr Asn Phe Phe Gly Ile Asn
115 120 125

Thr Ile Pro Ile Ala Leu Thr Glu Met Asp Tyr Phe Ile Arg Met Trp
130 135 140

Asn Gln Ala Ala Leu Ala Met Glu Val Tyr Gln Ala Glu Thr Ala Val
145 150 155 160

Asn Thr Leu Phe Glu Lys Leu Glu Pro Met Ala Ser Ile Leu Asp Pro
165 170 175

Gly Ala Ser Gln Ser Thr Thr Asn Pro Ile Phe Gly Met Pro Ser Pro
180 185 190

Gly Ser Ser Thr Pro Val Gly Gln Leu Pro Pro Ala Ala Thr Gln Thr
195 200 205

Leu Gly Gln Leu Gly Glu Met Ser Gly Pro Met Gln Gln Leu Thr Gln
210 215 220

Pro Leu Gln Gln Val Thr Ser Leu Phe Ser Gln Val Gly Gly Thr Gly
225 230 235 240

Gly Gly Asn Pro Ala Asp Glu Glu Ala Ala Gln Met Gly Leu Leu Gly
245 250 255 

Thr Ser Pro Leu Ser Asn His Pro Leu Ala Gly Gly Ser Gly Pro Ser 
260 265 270 

Ala Gly Ala Gly Leu Leu Arg Ala Glu Ser Leu Pro Gly Ala Gly Gly 
275 280 285 

Ser Leu Thr Arg Thr Pro Leu Met Ser Gln Leu Ile Glu Lys Pro Val
290 295 300 

Ala Pro Ser Val Met Pro Ala Ala Ala Ala Gly Ser Ser Ala Thr Gly 
305 310 315 320 

Gly Ala Ala Pro Val Gly Ala Gly Ala Met Gly Gln Gly Ala Gln Ser
325 330 335

Gly Gly Ser Thr Arg Pro Gly Leu Val Ala Pro Ala Pro Leu Ala Gln
340 345 350 

Glu Arg Glu Glu Asp Asp Glu Asp Asp Trp Asp Glu Glu Asp Asp Trp 
355 360 365 



(2) INFORMATION FOR SEQ ID NO: 110: 



(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110:



Met Ala Glu Met Lys Thr Asp Ala Ala Thr Leu Ala 
1 5 10 



(2) INFORMATION FOR SEQ ID NO: 111: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 396 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111:

GATCTCCGGC GACCTGAAAA CCCAGATCGA CCAGGTGGAG TCGACGGCAG GTTCGTTGCA 60

GGGCCAGTGG CGCGGCGCGG CGGGGACGGC CGCCCAGGCC GCGGTGGTGC GCTTCCAAGA 120

AGCAGCCAAT AAGCAGAAGC AGGAACTCGA CGAGATCTCG ACGAATATTC GTCAGGCCGG 180

CGTCCAATAC TCGAGGGCCG ACGAGGAGCA GCAGCAGGCG CTGTCCTCGC AAATGGGCTT 240

CTGACCCGCT AATACGAAAA GAAACGGAGC AAAAACATGA CAGAGCAGCA GTGGAATTTC 300

GCGGGTATCG AGGCCGCGGC AAGCGCAATC CAGGGAAATG TCACGTCCAT TCATTCCCTC 360

CTTGACGAGG GGAAGCAGTC CCTGACCAAG CTCGCA 396



(2) INFORMATION FOR SEQ ID NO: 112: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 80 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112: 

Ile Ser Gly Asp Leu Lys Thr Gln Ile Asp Gln Val Glu Ser Thr Ala
1 5 10 15

Gly Ser Leu Gln Gly Gln Trp Arg Gly Ala Ala Gly Thr Ala Ala Gln
20 25 30

Ala Ala Val Val Arg Phe Gln Glu Ala Ala Asn Lys Gln Lys Gln Glu
35 40 45

Leu Asp Glu Ile Ser Thr Asn Ile Arg Gln Ala Gly Val Gln Tyr Ser
50 55 60

Arg Ala Asp Glu Glu Gln Gln Gln Ala Leu Ser Ser Gln Met Gly Phe
65 70 75 80 



(2) INFORMATION FOR SEQ ID NO: 113: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 387 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113: 
GTGGATCCCG ATCCCGTGTT TCGCTATTCT ACGCGAACTC GGCGTTGCCC TATGCGAACA 60

TCCCAGTGAC GTTGCCTTCG GTCGAAGCCA TTGCCTGACC GGCTTCGCTG ATCGTCCGCG 120

CCAGGTTCTG CAGCGCGTTG TTCAGCTCGG TAGCCGTGGC GTCCCATTTT TGCTGGACAC 180

CCTGGTACGC CTCCGAACCG CTACCGCCCC AGGCCGCTGC GAGCTTGGTC AGGGACTGCT 240

TCCCCTCGTC AAGGAGGGAA TGAATGGACG TGACATTTCC CTGGATTGCG CTTGCCGCGG 300

CCTCGATACC CGCGAAATTC CACTGCTGCT CTGTCATGTT TTTGCTCCGT TTCTTTTCGT 360

ATTAGCGGGT CAGAAGCCCA TTTGCGA 387
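
Part of SEQ ID NO: 113 appears to read as the reverse complement of part of SEQ ID NO: 111, which is the kind of relationship the claims refer to as "the complements of said sequences". The sketch below is illustrative only: the function name `reverse_complement` and the two constants are assumptions introduced here, and the fragments are short runs copied from the two listings (positions 281-300 of SEQ ID NO: 111 and 311-340 of SEQ ID NO: 113).

```python
# Watson-Crick pairing for unambiguous bases; the ambiguity code N maps to N.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string, ignoring whitespace."""
    cleaned = "".join(seq.split()).upper()
    return cleaned.translate(_COMPLEMENT)[::-1]


# Two ten-base blocks from SEQ ID NO: 111 (positions 281-300).
FRAGMENT_111 = "CAGAGCAGCA GTGGAATTTC"
# Three ten-base blocks from SEQ ID NO: 113 (positions 311-340).
WINDOW_113 = "CGCGAAATTC CACTGCTGCT CTGTCATGTT"

if __name__ == "__main__":
    rc = reverse_complement(FRAGMENT_111)
    print(rc)
    # True if the fragment's reverse complement occurs in the SEQ ID NO: 113 window.
    print(rc in "".join(WINDOW_113.split()))
```

A full comparison would apply the same function to the complete sequences rather than to these short windows.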
(2) INFORMATION FOR SEQ ID NO: 114:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 272 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114: 

CGGCACGAGG ATCTCGGTTG GCCCAACGGC GCTGGCGAGG GCTCCGTTCC GGGGGCGAGC 60

TGCGCGCCGG ATGCTTCCTC TGCCCGCAGC CGCGCCTGGA TGGATGGACC AGTTGCTACC 120

TTCCCGACGT TTCGTTCGGT GTCTGTGCGA TAGCGGTGAC CCCGGCGCGC ACGTCGGGAG 180

TGTTGGGGGG CAGGCCGGGT CGGTGGTTCG GCCGGGGACG CAGACGGTCT GGACGGAACG 240

GGCGGGGGTT CGCCGATTGG CATCTTTGCC CA 272 

(2) INFORMATION FOR SEQ ID NO: 115:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 115:

Asp Pro Val Asp Ala Val Ile Asn Thr Thr Cys Asn Tyr Gly Gln Val
1 5 10 15

Val Ala Ala Leu 
20 

(2) INFORMATION FOR SEQ ID NO: 116: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 116:

Ala Val Glu Ser Gly Met Leu Ala Leu Gly Thr Pro Ala Pro Ser 
1 5 10 15 

(2) INFORMATION FOR SEQ ID NO: 117: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 117: 

Ala Ala Met Lys Pro Arg Thr Gly Asp Gly Pro Leu Glu Ala Ala Lys
1 5 10 15

Glu Gly Arg



(2) INFORMATION FOR SEQ ID NO: 118: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 118:

Tyr Tyr Trp Cys Pro Gly Gln Pro Phe Asp Pro Ala Trp Gly Pro
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 119: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 119:

Asp Ile Gly Ser Glu Ser Thr Glu Asp Gln Gln Xaa Ala Val
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 120: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 120: 

Ala Glu Glu Ser Ile Ser Thr Xaa Glu Xaa Ile Val Pro
1 5 10

(2) INFORMATION FOR SEQ ID NO: 121: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 121:

Asp Pro Glu Pro Ala Pro Pro Val Pro Thr Thr Ala Ala Ser Pro Pro 
1 5 10 15 

Ser 



(2) INFORMATION FOR SEQ ID NO: 122: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 122:

Ala Pro Lys Thr Tyr Xaa Glu Glu Leu Lys Gly Thr Asp Thr Gly 
1 5 10 15 

(2) INFORMATION FOR SEQ ID NO: 123: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 123: 

Asp Pro Ala Ser Ala Pro Asp Val Pro Thr Ala Ala Gln Leu Thr Ser
1 5 10 15 

Leu Leu Asn Ser Leu Ala Asp Pro Asn Val Ser Phe Ala Asn 
20 25 30 

(2) INFORMATION FOR SEQ ID NO: 124: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 124:

Asp Pro Pro Asp Pro His Gln Xaa Asp Met Thr Lys Gly Tyr Tyr Pro
1 5 10 15 

Gly Gly Arg Arg Xaa Phe 
20 

(2) INFORMATION FOR SEQ ID NO: 125: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 125:

Asp Pro Gly Tyr Thr Pro Gly 
1 5 

(2) INFORMATION FOR SEQ ID NO: 126: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(ix) FEATURE: 

(D) OTHER INFORMATION: /note= "The Second Residue Can Be Either a
Pro or Thr"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 126:

Xaa Xaa Gly Phe Thr Gly Pro Gln Phe Tyr
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 127: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(ix) FEATURE: 

(D) OTHER INFORMATION: /note= "The Third Residue Can Be Either a
Gln or Leu"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 127:

Xaa Pro Xaa Val Thr Ala Tyr Ala Gly 
1 5 

(2) INFORMATION FOR SEQ ID NO: 128: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 128:

Xaa Xaa Xaa Glu Lys Pro Phe Leu Arg
1 5 

(2) INFORMATION FOR SEQ ID NO: 129: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 129: 

Xaa Asp Ser Glu Lys Ser Ala Thr Ile Lys Val Thr Asp Ala Ser
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 130: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 130: 

Ala Gly Asp Thr Xaa Ile Tyr Ile Val Gly Asn Leu Thr Ala Asp
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 131:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 
(C) STRANDEDNESS:
(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 131: 

Ala Pro Glu Ser Gly Ala Gly Leu Gly Gly Thr Val Gln Ala Gly
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 132: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 132:

Xaa Tyr Ile Ala Tyr Xaa Thr Thr Ala Gly Ile Val Pro Gly Lys Ile
1 5 10 15

Asn Val His Leu Val 
20 

Claims

1. A polypeptide comprising an antigenic portion of a soluble
M. tuberculosis antigen, or a variant of said antigen that differs only in conservative 
substitutions and/or modifications, wherein said antigen has an N-terminal sequence selected 
from the group consisting of: 

(a) Asp-Pro-Val-Asp-Ala-Val-Ile-Asn-Thr-Thr-Cys-Asn-Tyr-Gly-Gln-
Val-Val-Ala-Ala-Leu (SEQ ID No. 115);

(b) Ala-Val-Glu-Ser-Gly-Met-Leu-Ala-Leu-Gly-Thr-Pro-Ala-Pro-Ser
(SEQ ID No. 116);

(c) Ala-Ala-Met-Lys-Pro-Arg-Thr-Gly-Asp-Gly-Pro-Leu-Glu-Ala-Ala-
Lys-Glu-Gly-Arg (SEQ ID No. 117);

(d) Tyr-Tyr-Trp-Cys-Pro-Gly-Gln-Pro-Phe-Asp-Pro-Ala-Trp-Gly-Pro 
(SEQ ID No. 118); 

(e) Asp-Ile-Gly-Ser-Glu-Ser-Thr-Glu-Asp-Gln-Gln-Xaa-Ala-Val (SEQ ID
No. 119);

(f) Ala-Glu-Glu-Ser-Ile-Ser-Thr-Xaa-Glu-Xaa-Ile-Val-Pro (SEQ ID 
No. 120); 

(g) Asp-Pro-Glu-Pro-Ala-Pro-Pro-Val-Pro-Thr-Thr-Ala-Ala-Ser-Pro-Pro-
Ser (SEQ ID No. 121); 

(h) Ala-Pro-Lys-Thr-Tyr-Xaa-Glu-Glu-Leu-Lys-Gly-Thr-Asp-Thr-Gly
(SEQ ID No. 122); 

(i) Asp-Pro-Ala-Ser-Ala-Pro-Asp-Val-Pro-Thr-Ala-Ala-Gln-Leu-Thr-Ser- 
Leu-Leu-Asn-Ser-Leu-Ala-Asp-Pro-Asn-Val-Ser-Phe-Ala-Asn (SEQ 
ID No. 123); and 

(j) Ala-Pro-Glu-Ser-Gly-Ala-Gly-Leu-Gly-Gly-Thr-Val-Gln-Ala-Gly
(SEQ ID No. 131);
wherein Xaa may be any amino acid.

2. A polypeptide comprising an immunogenic portion of an
M. tuberculosis antigen, or a variant of said antigen that differs only in conservative
substitutions and/or modifications, wherein said antigen has an N-terminal sequence selected 
from the group consisting of: 

(a) Asp-Pro-Pro-Asp-Pro-His-Gln-Xaa-Asp-Met-Thr-Lys-Gly-Tyr-Tyr-
Pro-Gly-Gly-Arg-Arg-Xaa-Phe (SEQ ID No. 124); and

(b) Xaa-Tyr-Ile-Ala-Tyr-Xaa-Thr-Thr-Ala-Gly-Ile-Val-Pro-Gly-Lys-Ile-
Asn-Val-His-Leu-Val (SEQ ID No. 132), wherein Xaa may be any
amino acid.

3. A polypeptide comprising an antigenic portion of a soluble 
M. tuberculosis antigen, or a variant of said antigen that differs only in conservative 
substitutions and/or modifications, wherein said antigen comprises an amino acid sequence
encoded by a DNA sequence selected from the group consisting of the sequences recited in
SEQ ID Nos. 1, 2, 4-10, 13-25, 52, 94 and 96, the complements of said sequences, and DNA 
sequences that hybridize to a sequence recited in SEQ ID Nos. 1, 2, 4-10, 13-25, 52, 94 and 
96 or a complement thereof under moderately stringent conditions. 

4. A polypeptide comprising an antigenic portion of an M. tuberculosis
antigen, or a variant of said antigen that differs only in conservative substitutions and/or
modifications, wherein said antigen comprises an amino acid sequence encoded by a DNA
sequence selected from the group consisting of the sequences recited in SEQ ID Nos. 26-51,
the complements of said sequences, and DNA sequences that hybridize to a sequence recited 
in SEQ ID Nos. 26-51 or a complement thereof under moderately stringent conditions. 

5. A DNA molecule comprising a nucleotide sequence encoding a 
polypeptide according to any one of claims 1-4. 

6. A recombinant expression vector comprising a DNA molecule 
according to claim 5. 

7. A host cell transformed with an expression vector according to claim 6.

8. The host cell of claim 7 wherein the host cell is selected from the group
consisting of E. coli, yeast and mammalian cells.

9. A method for detecting M. tuberculosis infection in a biological 
sample, comprising: 

(a) contacting a biological sample with one or more polypeptides 
according to any of claims 1-4; and 

(b) detecting in the sample the presence of antibodies that bind to at least 
one of the polypeptides, thereby detecting M. tuberculosis infection in the biological sample. 

10. A method for detecting M. tuberculosis infection in a biological
sample, comprising: 

(a) contacting a biological sample with a polypeptide having an N- 
terminal sequence selected from the group consisting of sequences provided in SEQ ID No: 
129 and 130; and 

(b) detecting in the sample the presence of antibodies that bind to at least 
one of the polypeptides, thereby detecting M. tuberculosis infection in the biological sample.

11. A method for detecting M. tuberculosis infection in a biological 
sample, comprising: 

(a) contacting a biological sample with one or more polypeptides encoded 
by a DNA sequence selected from the group consisting of SEQ ID Nos. 3, 11 and 12, the
complements of said sequences, and DNA sequences that hybridize to a sequence recited in
SEQ ID Nos. 3, 11 and 12; and

(b) detecting in the sample the presence of antibodies that bind to at least 
one of the polypeptides, thereby detecting M. tuberculosis infection in the biological sample. 

12. The method of any one of claims 9-11 wherein step (a) additionally
comprises contacting the biological sample with a 38 kD M. tuberculosis antigen and step (b)
additionally comprises detecting in the sample the presence of antibodies that bind to the
38 kD M. tuberculosis antigen.

13. The method of any one of claims 9-11 wherein the polypeptide(s) are
bound to a solid support. 

14. The method of claim 13 wherein the solid support comprises 
nitrocellulose, latex or a plastic material. 

15. The method of any one of claims 9-11 wherein the biological sample is
selected from the group consisting of whole blood, serum, plasma, saliva, cerebrospinal fluid 
and urine. 

16. The method of claim 15 wherein the biological sample is whole blood
or serum.

17. A method for detecting M. tuberculosis infection in a biological 
sample, comprising: 

(a) contacting the sample with a first and a second oligonucleotide primer
in a polymerase chain reaction, the first and the second oligonucleotide primers comprising at
least about 10 contiguous nucleotides of a DNA molecule according to claim 5; and

(b) detecting in the sample a DNA sequence that amplifies in the presence
of the first and second oligonucleotide primers, thereby detecting M. tuberculosis infection.

18. A method for detecting M. tuberculosis infection in a biological
sample, comprising: 

(a) contacting the sample with a first and a second oligonucleotide primer 
in a polymerase chain reaction, the first and the second oligonucleotide primers comprising at 
least about 10 contiguous nucleotides of a DNA sequence selected from the group consisting
of SEQ ID Nos. 3, 11 and 12; and

(b) detecting in the sample a DNA sequence that amplifies in the presence 
of the first and second oligonucleotide primers, thereby detecting M. tuberculosis infection. 

19. The method of claims 17 or 18 wherein the biological sample is 
selected from the group consisting of whole blood, sputum, serum, plasma, saliva, 
cerebrospinal fluid and urine. 



20. A method for detecting M. tuberculosis infection in a biological 
sample, comprising: 

(a) contacting the sample with one or more oligonucleotide probes 
comprising at least about 15 contiguous nucleotides of a DNA molecule according to claim 5; 
and 

(b) detecting in the sample a DNA sequence that hybridizes to the 
oligonucleotide probe, thereby detecting M. tuberculosis infection. 

21. A method for detecting M. tuberculosis infection in a biological 
sample, comprising: 

(a) contacting the sample with one or more oligonucleotide probes 
comprising at least about 15 contiguous nucleotides of a DNA sequence selected from the 
group consisting of SEQ ID Nos. 3, 11 and 12; and 

(b) detecting in the sample a DNA sequence that hybridizes to the 
oligonucleotide probe, thereby detecting M. tuberculosis infection. 

22. The method of claims 20 or 21 wherein the biological sample is 
selected from the group consisting of whole blood, sputum, serum, plasma, saliva, 
cerebrospinal fluid and urine. 

23. A method for detecting M. tuberculosis infection in a biological 
sample, comprising: 

(a) contacting the biological sample with a binding agent which is capable 
of binding to a polypeptide according to any one of claims 1-4; and 

(b) detecting in the sample a protein or polypeptide that binds to the 
binding agent, thereby detecting M. tuberculosis infection in the biological sample. 

24. A method for detecting M. tuberculosis infection in a biological 
sample, comprising: 

(a) contacting the biological sample with a binding agent which is capable 
of binding to a polypeptide having an N-terminal sequence selected from the group consisting 
of sequences provided in SEQ ID No: 129 and 130; and 

(b) detecting in the sample a protein or polypeptide that binds to the 
binding agent, thereby detecting M. tuberculosis infection in the biological sample. 

25. A method for detecting M. tuberculosis infection in a biological 
sample, comprising: 

(a) contacting the biological sample with a binding agent which is capable 
of binding to a polypeptide encoded by a DNA sequence selected from the group consisting 
of SEQ ID Nos. 3, 11 and 12, the complements of said sequences, and DNA sequences that 
hybridize to a sequence recited in SEQ ID Nos. 3, 11 and 12; and 

(b) detecting in the sample a protein or polypeptide that binds to the 
binding agent, thereby detecting M. tuberculosis infection in the biological sample. 

26. The method of any one of claims 23-25 wherein the binding agent is a 
monoclonal antibody. 

27. The method of any one of claims 23-25 wherein the binding agent is a 
polyclonal antibody. 

28. A diagnostic kit comprising: 

(a) one or more polypeptides according to any of claims 1-4; and 

(b) a detection reagent. 

29. A diagnostic kit comprising: 

(a) one or more polypeptides having an N-terminal sequence selected from 
the group consisting of sequences provided in SEQ ID No: 129 and 130; and 

(b) a detection reagent. 

30. A diagnostic kit comprising: 

(a) one or more polypeptides encoded by a DNA sequence selected from 
the group consisting of SEQ ID Nos. 3, 11 and 12, the complements of said sequences, and 
DNA sequences that hybridize to a sequence recited in SEQ ID Nos. 3, 11 and 12; and 

(b) a detection reagent. 

31. The kit of any one of claims 28-30 wherein the polypeptide(s) are 
immobilized on a solid support. 

32. The kit of claim 31 wherein the solid support comprises nitrocellulose, 
latex or a plastic material. 

33. The kit of any one of claims 28-30 wherein the detection reagent 
comprises a reporter group conjugated to a binding agent. 

34. The kit of claim 33 wherein the binding agent is selected from the 
group consisting of anti-immunoglobulins, Protein G, Protein A and lectins. 

35. The kit of claim 33 wherein the reporter group is selected from the 
group consisting of radioisotopes, fluorescent groups, luminescent groups, enzymes, biotin 
and dye particles. 

36. A diagnostic kit comprising a first polymerase chain reaction primer 
and a second polymerase chain reaction primer, the first and second primers each comprising 
at least about 10 contiguous nucleotides of a DNA molecule according to claim 5. 

37. A diagnostic kit comprising a first polymerase chain reaction primer 
and a second polymerase chain reaction primer, the first and second primers each comprising 
at least about 10 contiguous nucleotides of a DNA sequence selected from the group 
consisting of SEQ ID Nos. 3, 11 and 12. 

38. A diagnostic kit comprising at least one oligonucleotide probe, the 
oligonucleotide probe comprising at least about 15 contiguous nucleotides of a DNA 
molecule according to claim 5. 

39. A diagnostic kit comprising at least one oligonucleotide probe, the 
oligonucleotide probe comprising at least about 15 contiguous nucleotides of a DNA 
sequence selected from the group consisting of SEQ ID Nos. 3, 11 and 12. 

40. A monoclonal antibody that binds to a polypeptide according to any of 
claims 1-4. 

41. A polyclonal antibody that binds to a polypeptide according to any of 
claims 1-4. 

42. A fusion protein comprising two or more polypeptides according to 
any one of claims 1-4. 

43. A fusion protein comprising one or more polypeptides according to 
any one of claims 1-4 and ESAT-6 (SEQ ID No. 99). 

44. A fusion protein comprising a polypeptide having an N-terminal 
sequence selected from the group of sequences provided in SEQ ID Nos. 129 and 130. 